Wikitech
labswiki
https://wikitech.wikimedia.org/wiki/Main_Page
MediaWiki 1.47.0-wmf.4
Deployments
2026-05-30T17:10:35Z
ScheduleDeploymentBot
37566
Add [[gerrit:1295531]] to Monday, June 01 UTC morning backport window
{{Navigation MediaWiki deployment}}
This page tracks '''upcoming''' '''deployments''' of software to the [[:m:Special:SiteMatrix|Wikimedia Foundation servers]].
== Getting started ==
Make sure you have joined the {{irc|wikimedia-operations}} IRC channel, as all deployment-related communication happens there.
If you need help, contact [[:mw:Wikimedia Release Engineering Team|Release Engineering]] on IRC at {{irc|wikimedia-releng}} and ping Tyler (<code>thcipriani</code>).
* '''MediaWiki is deployed weekly''' through the [[/Train|Deployment Train]]. Other services follow their own schedule.
* '''Times are pinned to San Francisco''', so the UTC time of each window shifts in March and November with [[:en:Daylight saving time in the United States|DST]].
* '''Prefer regular [[Backport windows]]''' over adding new windows. To request deployment of a config change or backport, add your username and Gerrit URL to one of the backport windows on this page. You must be online in #wikimedia-operations on IRC during your deployment, and you must install [[WikimediaDebug]] ahead of time. The #wikimedia-operations channel requires you to [[:m:IRC/Instructions#Register your nickname, identify, and enforce|register your nickname]] before you can join.
** You can use the '''backport scheduling tool''' to more easily edit this page: <div style="text-align: center; margin: 1em 0">{{Clickable button 2|:toollabs:schedule-deployment|Schedule a backport|class=mw-ui-progressive}}</div>
* Tasks that meet the [[/Inclusion criteria|Inclusion criteria]], including long-running tasks, '''require their own windows'''. '''Schedule more time''' than you think you need to account for delays and setbacks; we recommend one hour for most tasks.
** To create or modify a recurring deploy window, send a patchset to the [[:gitlab:repos/releng/release/-/blob/main/make-deployment-calendar/deployments-calendar.yaml|deployments-calendar.yaml file]] in <code>repos/releng/release.git</code>.
** To create a one-off window, edit this page directly.
** '''Announce''' changes to the [[mail:ops|ops mailing list]] ahead of time if you anticipate or are uncertain about noticeable impacts to database load, HTTP caching, or the introduction of new cookies.
** '''Announce''' deployments of major features to the community via [[:m:Tech/News/Next|Tech News]] and/or via other [[:mw:Wikimedia_Product_Guidance/Communication_channels|Product communication channels]].
* '''Something went wrong?''' See [[Incident response]]. Is there a user-impacting problem? Communicate in the {{irc|wikimedia-operations}} IRC channel. If there is a Phabricator task, ensure [[:phab:tag/wikimedia-incident/|#Wikimedia-Incident]] is tagged, and consider setting the [[:mw:Phabricator/Project_management#Priority_levels|Unbreak Now]] priority.
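Because window times are pinned to San Francisco, it can help to compute the UTC time of a window yourself. The following is a minimal sketch, not part of any official tooling: it assumes the <code>SF</code> suffix used in the <code>when</code> fields on this page corresponds to the IANA zone <code>America/Los_Angeles</code>, and the helper name <code>sf_to_utc</code> is made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Assumption: "SF" on this page means the America/Los_Angeles time zone.
SF = ZoneInfo("America/Los_Angeles")


def sf_to_utc(when: str) -> datetime:
    """Convert a 'YYYY-MM-DD HH:MM SF' string into an aware UTC datetime."""
    stamp = when.removesuffix(" SF").strip()
    local = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=SF)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    # The same wall-clock time in San Francisco maps to different UTC hours
    # depending on whether daylight saving time is in effect on that date.
    print(sf_to_utc("2026-06-01 00:00 SF").isoformat())  # summer date
    print(sf_to_utc("2026-01-12 00:00 SF").isoformat())  # winter date
```

With this sketch, a window listed at 00:00 SF lands one hour later in UTC during the winter than during the summer, which is the shift described in the list above.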
__TOC__
{{anchor|Next Week|Near Term|Near term|Near-term}}{{clear}}
[[Category:Deployment]]
{{Note|content=Subscribe in Google Calendar via <code>wikimedia.org_rudis09ii2mm5fk4hgdjeh1u64@group.calendar.google.com</code>.<br>This may not include one-off windows. '''If there are differences, then the wiki page is canonical and correct'''.}}
==Week of June 01==
==={{Deployment_day|date=2026-05-31}}===
{{Deployment calendar event card
|when=2026-05-31 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
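Every entry on this page is a call to the <code>Deployment calendar event card</code> template with the parameters <code>when</code>, <code>length</code>, <code>window</code>, <code>who</code> and <code>what</code>. For readers who want to process the page with a script, here is a rough sketch of reading one card into a dictionary. It only relies on the line layout visible on this page (one parameter per line, continuation lines belonging to the previous parameter); the function name <code>parse_card</code> is hypothetical and this is not how the wiki itself renders the template.

```python
import re

# Sample card copied from this page.
CARD = """{{Deployment calendar event card
|when=2026-05-31 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}"""

# A parameter line starts with "|name=" for one of the five known names.
FIELD = re.compile(r"^\|(when|length|window|who|what)=(.*)$")


def parse_card(text: str) -> dict[str, str]:
    """Return the card parameters as a dict; extra lines extend the last one."""
    fields: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None and line.strip() != "}}":
            # Continuation line, e.g. additional requests under |what=.
            fields[current] += "\n" + line
    return fields


if __name__ == "__main__":
    card = parse_card(CARD)
    print(card["when"], float(card["length"]), repr(card["who"]))
```

The <code>length</code> value is kept as text and converted with <code>float()</code> on use, since some windows on this page last <code>0.5</code> hours.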
==={{Deployment_day|date=2026-06-01}}===
{{Deployment calendar event card
|when=2026-06-01 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|WMDE-Fisch|WMDE-Fisch}}
{{deploy|type=1.47.0-wmf.4|gerrit=1294826|title=Update VE core submodule to master (9cf5524e7)|status=}} - {{phabricator|T424232}}
{{ircnick|atsukoito|atsukoito}}
{{deploy|type=config|gerrit=1294949|title=translate: adding separate read/write endpoints|status=}} - {{phabricator|T425377}}
{{ircnick|xxb|xxb}}
{{deploy|type=config|gerrit=1295531|title=Enable AbuseFilter block action on nlwiki|status=}} - {{phabricator|T427384}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
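Backport requests in a window follow a visible pattern: an <code>ircnick</code> line naming the requester, followed by one or more <code>deploy</code> lines carrying the change type, Gerrit change number and title. The sketch below pulls those requests out of a <code>what</code> value. It is an illustration based only on the entries shown on this page; the function name <code>list_requests</code> and the returned dictionary keys are made up, and other parameter orders in the <code>deploy</code> template are not handled.

```python
import re

# Two requests copied from the backport window above.
WHAT = """{{ircnick|WMDE-Fisch|WMDE-Fisch}}
{{deploy|type=1.47.0-wmf.4|gerrit=1294826|title=Update VE core submodule to master (9cf5524e7)|status=}} - {{phabricator|T424232}}
{{ircnick|atsukoito|atsukoito}}
{{deploy|type=config|gerrit=1294949|title=translate: adding separate read/write endpoints|status=}} - {{phabricator|T425377}}"""

NICK = re.compile(r"\{\{ircnick\|([^|}]+)")
DEPLOY = re.compile(
    r"\{\{deploy\|type=([^|]*)\|gerrit=(\d+)\|title=(.*?)\|status=([^}]*)\}\}"
)
TASK = re.compile(r"\{\{phabricator\|(T\d+)\}\}")


def list_requests(what: str) -> list[dict]:
    """Collect one dict per deploy line, tagged with the preceding nick."""
    requests = []
    nick = None
    for line in what.splitlines():
        nick_match = NICK.match(line)
        if nick_match:
            nick = nick_match.group(1)
            continue
        deploy = DEPLOY.search(line)
        if deploy:
            task = TASK.search(line)
            requests.append({
                "nick": nick,
                "type": deploy.group(1),
                "gerrit": int(deploy.group(2)),
                "title": deploy.group(3),
                "task": task.group(1) if task else None,
            })
    return requests


if __name__ == "__main__":
    for request in list_requests(WHAT):
        print(request["nick"], request["type"], request["gerrit"], request["task"])
```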
{{Deployment calendar event card
|when=2026-06-01 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-01 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-01 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-01 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-06-01 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-01 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-01 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|sfaci|sfaci}}
{{deploy|type=config|gerrit=1285412|title=Remove `wgTestKitchenExperimentStreamNames`|status=}} - {{phabricator|T422358}}
{{ircnick|RoanKattouw|RoanKattouw}}
{{deploy|type=1.47.0-wmf.4|gerrit=1295504|title=passwordlessLogin: Don't immediately error out in unsupported browsers|status=}} - {{phabricator|T427562}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-01 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|alexsanford|Alex}}, {{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-06-01 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: Often skipped. The reader teams do not typically check IRC, so assume the window is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-01 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.47.0-wmf.5</code>
}}
{{Deployment calendar event card
|when=2026-06-01 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.47.0-wmf.5</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-06-01 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-06-01 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-01 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for primary database maintenance
}}
==={{Deployment_day|date=2026-06-02}}===
{{Deployment calendar event card
|when=2026-06-02 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-02 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-02 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-02 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-02 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (formerly known as MPIC)
}}
{{Deployment calendar event card
|when=2026-06-02 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-02 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-06-02 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-02 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-02 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4|1.47.0-wmf.4}}
* group0 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
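The train cards use <code>DeployOneWeekMini</code> with three positional values, one of which contains an arrow such as <code>1.47.0-wmf.4->1.47.0-wmf.5</code>. Judging from the bullet lines on this page (group0 on the first train day, group1 next, group2 last), the three values appear to describe group0, group1 and group2 in that order, with the arrow marking the group being moved in that window. That reading is an inference from this page, not documented template behaviour. Under that assumption, a small sketch for turning the call into a per-group summary could look like this; <code>parse_train_state</code> is a hypothetical helper name.

```python
def parse_train_state(template: str) -> dict[str, dict]:
    """Split a DeployOneWeekMini call into per-group version info.

    Assumption: the three positional values are group0, group1 and group2,
    and "old->new" marks the group that is being moved in this window.
    """
    inner = template.strip().removeprefix("{{").removesuffix("}}")
    name, *groups = inner.split("|")
    if name != "DeployOneWeekMini" or len(groups) != 3:
        raise ValueError("expected {{DeployOneWeekMini|g0|g1|g2}}")
    state = {}
    for index, value in enumerate(groups):
        old, arrow, new = value.partition("->")
        state[f"group{index}"] = {
            "from": old,
            "to": new if arrow else old,  # unchanged groups stay on "old"
            "moving": bool(arrow),
        }
    return state


if __name__ == "__main__":
    call = "{{DeployOneWeekMini|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4|1.47.0-wmf.4}}"
    for group, info in parse_train_state(call).items():
        print(group, info)
```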
{{Deployment calendar event card
|when=2026-06-02 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-02 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: Often skipped. The reader teams do not typically check IRC, so assume the window is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-02 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-03}}===
{{Deployment calendar event card
|when=2026-06-03 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-03 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4}}
* group1 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-03 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-03 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-06-03 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-03 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-03 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-03 08:00 SF
|length=1
|window=Jenkins switchover/upgrade
|who={{ircnick|mutante|Dzahn}}, {{ircnick|hashar|AMusso}}
|what=Moving Jenkins to a Java 21 host, upgrading Jenkins agents, '''Gerrit CI downtime'''.
}}
{{Deployment calendar event card
|when=2026-06-03 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-03 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4}}
* group1 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-03 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-03 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-03 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: Often skipped. The reader teams do not typically check IRC, so assume the window is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-03 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-03 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for primary database maintenance
}}
==={{Deployment_day|date=2026-06-04}}===
{{Deployment calendar event card
|when=2026-06-04 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-04 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5}}
* group2 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-04 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-04 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-04 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-04 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-04 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-06-04 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-04 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-04 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-04 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5}}
* group2 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-04 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-04 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: Often skipped. The reader teams do not typically check IRC, so assume the window is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-04 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-05}}===
{{Deployment calendar event card
|when=2026-06-05 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-06-05 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-06-06}}===
{{Deployment calendar event card
|when=2026-06-06 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==Week of June 08==
==={{Deployment_day|date=2026-06-07}}===
{{Deployment calendar event card
|when=2026-06-07 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-06-08}}===
{{Deployment calendar event card
|when=2026-06-08 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-08 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-08 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-08 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-08 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-06-08 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-08 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-08 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-08 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|alexsanford|Alex}}, {{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-06-08 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: Often skipped. The reader teams do not typically check IRC, so assume the window is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-08 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.47.0-wmf.6</code>
}}
{{Deployment calendar event card
|when=2026-06-08 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.47.0-wmf.6</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-06-08 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-06-08 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-08 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for primary database maintenance
}}
==={{Deployment_day|date=2026-06-09}}===
{{Deployment calendar event card
|when=2026-06-09 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-09 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-09 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-09 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-09 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (formerly known as MPIC)
}}
{{Deployment calendar event card
|when=2026-06-09 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-09 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-06-09 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-09 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-09 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5->1.47.0-wmf.6|1.47.0-wmf.5|1.47.0-wmf.5}}
* group0 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-09 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-09 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: Often skipped. The reader teams do not typically check IRC, so assume the window is not being used if nothing has started 5 minutes after the start time.
}}
{{Deployment calendar event card
|when=2026-06-09 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-10}}===
{{Deployment calendar event card
|when=2026-06-10 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-10 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6|1.47.0-wmf.5}}
* group1 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-10 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-10 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-06-10 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-10 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-10 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-10 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-10 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6|1.47.0-wmf.5}}
* group1 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-10 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-10 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-10 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume this window is not being used if it is 5 minutes past the start.
}}
{{Deployment calendar event card
|when=2026-06-10 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-10 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for maintenance of primary database hosts
}}
==={{Deployment_day|date=2026-06-11}}===
{{Deployment calendar event card
|when=2026-06-11 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-11 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6}}
* group2 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-11 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-11 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content Transform Team Node.js services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-11 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-11 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-11 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-06-11 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-11 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-11 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-11 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6}}
* group2 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-11 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-11 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume this window is not being used if it is 5 minutes past the start.
}}
{{Deployment calendar event card
|when=2026-06-11 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-12}}===
{{Deployment calendar event card
|when=2026-06-12 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-06-12 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-06-13}}===
{{Deployment calendar event card
|when=2026-06-13 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
2026-05-30T18:06:33Z
ScheduleDeploymentBot
Add [[gerrit:1295536]] to Monday, June 01 UTC afternoon backport window
{{Navigation MediaWiki deployment}}
This page tracks '''upcoming''' '''deployments''' of software to the [[:m:Special:SiteMatrix|Wikimedia Foundation servers]].
== Getting started ==
Ensure you have joined the {{irc|wikimedia-operations}} IRC channel, as all deployment-related communication happens there.
If you need help, contact [[:mw:Wikimedia Release Engineering Team|Release Engineering]] on IRC at {{irc|wikimedia-releng}} and ping Tyler (<code>thcipriani</code>).
* '''MediaWiki is deployed weekly''' through the [[/Train|Deployment Train]]. Other services follow their own schedule.
* '''Times are pinned to San Francisco''', so the corresponding UTC times shift in March and November per [[:en:Daylight saving time in the United States|DST]].
* '''Prefer regular [[Backport windows]]''' over adding new windows. To request deployment of a config change or backport, add your username and Gerrit URL to one of the backport windows on this page. You must be online in #wikimedia-operations on IRC during your deployment and install [[WikimediaDebug]] ahead of time. The #wikimedia-operations channel requires you to [[:m:IRC/Instructions#Register your nickname, identify, and enforce|register your nickname]] before you can join.
** You can use the '''backport scheduling tool''' to more easily edit this page: <div style="text-align: center; margin: 1em 0">{{Clickable button 2|:toollabs:schedule-deployment|Schedule a backport|class=mw-ui-progressive}}</div>
* Tasks that meet the [[/Inclusion criteria|Inclusion criteria]], including long-running tasks, '''require their own windows'''. '''Schedule more time''' than you think you need, to account for delays and setbacks; we recommend one hour for most tasks.
** To create or modify a recurring deploy window, send a patchset to the [[:gitlab:repos/releng/release/-/blob/main/make-deployment-calendar/deployments-calendar.yaml|deployments-calendar.yaml file]] in <code>repos/releng/release.git</code>.
** To create a one-off window, edit this page accordingly.
** '''Announce''' changes to the [[mail:ops|ops mailing list]] ahead of time if you anticipate or are uncertain about noticeable impacts to database load, HTTP caching, or the introduction of new cookies.
** '''Announce''' deployments of major features to the community via [[:m:Tech/News/Next|Tech News]] and/or via other [[:mw:Wikimedia_Product_Guidance/Communication_channels|Product communication channels]].
* '''Something went wrong?''' See [[Incident response]]. Is there a user-impacting problem? Communicate in the {{irc|wikimedia-operations}} IRC channel. If there is a Phabricator task, ensure [[:phab:tag/wikimedia-incident/|#Wikimedia-Incident]] is tagged, and consider setting the [[:mw:Phabricator/Project_management#Priority_levels|Unbreak Now]] priority.
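The note above about times being pinned to San Francisco can be illustrated with a minimal sketch. It assumes that the <code>SF</code> suffix used in the <code>when</code> field of the cards corresponds to the IANA time zone <code>America/Los_Angeles</code>, and that <code>length</code> is a duration in hours. The function name <code>window_utc</code> is hypothetical and is not part of any tool used to generate this page.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: the "SF" suffix on this page means this IANA zone.
SF = ZoneInfo("America/Los_Angeles")


def window_utc(when: str, length_hours: float) -> tuple[datetime, datetime]:
    """Convert a card's `when` and `length` values to a (start, end) pair in UTC."""
    # Drop the trailing " SF" marker, then parse the wall-clock time.
    stamp = when.removesuffix(" SF")
    start_local = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=SF)
    start_utc = start_local.astimezone(timezone.utc)
    # Add the duration in UTC so a DST change cannot distort the length.
    return start_utc, start_utc + timedelta(hours=length_hours)


if __name__ == "__main__":
    summer_start, summer_end = window_utc("2026-06-09 11:00 SF", 2)
    winter_start, _ = window_utc("2026-01-13 11:00 SF", 2)
    print(summer_start.isoformat(), summer_end.isoformat())
    print(winter_start.isoformat())
```

Under these assumptions, an 11:00 SF window in June starts at 18:00 UTC, while the same wall-clock time in January starts at 19:00 UTC, which is the seasonal shift the note describes.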
__TOC__
{{anchor|Next Week|Near Term|Near term|Near-term}}{{clear}}
[[Category:Deployment]]
{{Note|content=Subscribe in Google Calendar via <code>wikimedia.org_rudis09ii2mm5fk4hgdjeh1u64@group.calendar.google.com</code>.<br>This may not include one-off windows. '''If there are differences, then the wiki page is canonical and correct'''.}}
==Week of June 01==
==={{Deployment_day|date=2026-05-31}}===
{{Deployment calendar event card
|when=2026-05-31 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-06-01}}===
{{Deployment calendar event card
|when=2026-06-01 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|WMDE-Fisch|WMDE-Fisch}}
{{deploy|type=1.47.0-wmf.4|gerrit=1294826|title=Update VE core submodule to master (9cf5524e7)|status=}} - {{phabricator|T424232}}
{{ircnick|atsukoito|atsukoito}}
{{deploy|type=config|gerrit=1294949|title=translate: adding separate read/write endpoints|status=}} - {{phabricator|T425377}}
{{ircnick|xxb|xxb}}
{{deploy|type=config|gerrit=1295531|title=Enable AbuseFilter block action on nlwiki|status=}} - {{phabricator|T427384}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-01 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-01 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|codenamenoreste|Codename Noreste}}
{{deploy|type=config|gerrit=1295536|title=swwiki: Enable the Visual Editor on the project namespace|status=}} - {{phabricator|T427117}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-01 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-01 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-06-01 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-01 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-01 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|sfaci|sfaci}}
{{deploy|type=config|gerrit=1285412|title=Remove `wgTestKitchenExperimentStreamNames`|status=}} - {{phabricator|T422358}}
{{ircnick|RoanKattouw|RoanKattouw}}
{{deploy|type=1.47.0-wmf.4|gerrit=1295504|title=passwordlessLogin: Don't immediately error out in unsupported browsers|status=}} - {{phabricator|T427562}}
{{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-01 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|alexsanford|Alex}}, {{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for deploys related to the Security team.
}}
{{Deployment calendar event card
|when=2026-06-01 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume this window is not being used if it is 5 minutes past the start.
}}
{{Deployment calendar event card
|when=2026-06-01 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.47.0-wmf.5</code>
}}
{{Deployment calendar event card
|when=2026-06-01 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.47.0-wmf.5</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-06-01 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare-metal servers (except the most recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-06-01 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-01 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for maintenance of primary database hosts
}}
==={{Deployment_day|date=2026-06-02}}===
{{Deployment calendar event card
|when=2026-06-02 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-02 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-02 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content Transform Team Node.js services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-02 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-02 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (formerly known as MPIC)
}}
{{Deployment calendar event card
|when=2026-06-02 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-02 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-06-02 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-02 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-02 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4|1.47.0-wmf.4}}
* group0 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-02 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-02 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume this window is not being used if it is 5 minutes past the start.
}}
{{Deployment calendar event card
|when=2026-06-02 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-03}}===
{{Deployment calendar event card
|when=2026-06-03 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-03 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4}}
* group1 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-03 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-03 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-06-03 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-03 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-03 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-03 08:00 SF
|length=1
|window=Jenkins switchover/upgrade
|who={{ircnick|mutante|Dzahn}}, {{ircnick|hashar|AMusso}}
|what=Moving Jenkins to a Java 21 host and upgrading Jenkins agents; expect '''Gerrit CI downtime'''.
}}
{{Deployment calendar event card
|when=2026-06-03 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-03 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5|1.47.0-wmf.4}}
* group1 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-03 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-03 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-03 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume this window is not being used if it is 5 minutes past the start.
}}
{{Deployment calendar event card
|when=2026-06-03 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-03 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for maintenance of primary database hosts
}}
==={{Deployment_day|date=2026-06-04}}===
{{Deployment calendar event card
|when=2026-06-04 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-04 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5}}
* group2 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-04 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-04 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content Transform Team Node.js services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-04 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-04 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-04 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-06-04 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-04 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-04 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a Kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-04 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dancy|Ahmon}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5|1.47.0-wmf.5|1.47.0-wmf.4->1.47.0-wmf.5}}
* group2 to [[mw:MediaWiki_1.47/wmf.5|1.47.0-wmf.5]]
* '''Blockers: {{phabricator|T423914}}'''
}}
{{Deployment calendar event card
|when=2026-06-04 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-04 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped. The reader teams do not typically check IRC, so assume this window is not being used if it is 5 minutes past the start.
}}
{{Deployment calendar event card
|when=2026-06-04 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-05}}===
{{Deployment calendar event card
|when=2026-06-05 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-06-05 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-06-06}}===
{{Deployment calendar event card
|when=2026-06-06 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==Week of June 08==
==={{Deployment_day|date=2026-06-07}}===
{{Deployment calendar event card
|when=2026-06-07 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
==={{Deployment_day|date=2026-06-08}}===
{{Deployment calendar event card
|when=2026-06-08 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-08 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-08 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-08 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-08 08:30 SF
|length=0.5
|window=Wikimedia Portals Update
|who={{ircnick|jan_drewniak|Jan Drewniak}}
|what=Weekly window for the portals page: https://www.wikipedia.org/
}}
{{Deployment calendar event card
|when=2026-06-08 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-08 10:00 SF
|length=0.5
|window=Wikidata Query Service weekly deploy
|who={{ircnick|ryankemper|Ryan}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-08 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-08 14:00 SF
|length=2
|window=Weekly Security deployment window
|who={{ircnick|alexsanford|Alex}}, {{ircnick|Reedy|Sam}}, {{ircnick|sbassett|Scott}}, {{ircnick|Maryum|Maryum}}, {{ircnick|manfredi|Manfredi}}
|what=Held deployment window for Security-team related deploys.
}}
{{Deployment calendar event card
|when=2026-06-08 16:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped, the reader teams do not typically check IRC so assume this is not being used if 5 minutes past the start
}}
{{Deployment calendar event card
|when=2026-06-08 19:00 SF
|length=1
|window=Automatic branching of MediaWiki, extensions, skins, and vendor – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Branch <code>wmf/1.47.0-wmf.6</code>
}}
{{Deployment calendar event card
|when=2026-06-08 20:00 SF
|length=1
|window=Automatic deployment of MediaWiki, extensions, skins, and vendor to testwikis only – see [[Heterogeneous deployment/Train deploys]]
|who=N/A
|what=Deploy <code>wmf/1.47.0-wmf.6</code> to testwikis
}}
{{Deployment calendar event card
|when=2026-06-08 21:00 SF
|length=1
|window=Automatic removal of all obsolete MediaWiki versions from the deployment and bare metal servers (except the most-recent obsolete version)
|who=N/A
|what=Runs <code>scap clean auto</code>
}}
{{Deployment calendar event card
|when=2026-06-08 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-08 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for database primary masters maintenance
}}
==={{Deployment_day|date=2026-06-09}}===
{{Deployment calendar event card
|when=2026-06-09 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-09 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-09 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-09 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-09 07:00 SF
|length=0.5
|window=Test Kitchen UI Deployment Window
|who=Experimentation Platform Team
|what=Deployment of Test Kitchen UI (fka MPIC)
}}
{{Deployment calendar event card
|when=2026-06-09 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-09 08:00 SF
|length=1
|window=SRE Collaboration Services office hours
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=Services including Gerrit, Phorge (Phabricator), GitLab
}}
{{Deployment calendar event card
|when=2026-06-09 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-09 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-09 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.5->1.47.0-wmf.6|1.47.0-wmf.5|1.47.0-wmf.5}}
* group0 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-09 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-09 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped, the reader teams do not typically check IRC so assume this is not being used if 5 minutes past the start
}}
{{Deployment calendar event card
|when=2026-06-09 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-10}}===
{{Deployment calendar event card
|when=2026-06-10 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-10 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6|1.47.0-wmf.5}}
* group1 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-10 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-10 04:00 SF
|length=1
|window=[[mw:Services|Services]] – [[Citoid]] / [[Zotero]]
|who=Marielle ({{ircnick|mvolz}})
|what=See [[mw:Citoid|Citoid]]
}}
{{Deployment calendar event card
|when=2026-06-10 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-10 07:00 SF
|length=1
|window=Wikifunctions Services UTC Afternoon
|who=Abstract Wikipedia team (Africa, Europe, Eastern Americas)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-10 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-10 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-10 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6|1.47.0-wmf.5}}
* group1 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-10 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-10 14:00 SF
|length=1
|window=Wikifunctions Services UTC Late
|who=Abstract Wikipedia team (North and South America)
|what=Wikifunctions back-end k8s services
}}
{{Deployment calendar event card
|when=2026-06-10 15:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped, the reader teams do not typically check IRC so assume this is not being used if 5 minutes past the start
}}
{{Deployment calendar event card
|when=2026-06-10 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-10 23:00 SF
|length=0.5
|window=Primary database switchover
|who={{ircnick|marostegui|Manuel Arostegui}}, {{ircnick|Amir1|Amir}}, {{ircnick|federico3|Federico Ceratto}}
|what=Held deployment window for database primary masters maintenance
}}
==={{Deployment_day|date=2026-06-11}}===
{{Deployment calendar event card
|when=2026-06-11 00:00 SF
|length=1
|window=[[Backport windows|UTC morning backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Amir1|Amir}}, {{ircnick|urbanecm|Martin}}, {{ircnick|awight|Adam}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-11 01:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version (secondary timeslot)
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6}}
* group2 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-11 03:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC mid-day)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-11 05:00 SF
|length=1
|window=Mobileapps/RESTBase/Wikifeeds
|who=Content Transform Team
|what=Content transform team node services (mobileapps/wikifeeds)
}}
{{Deployment calendar event card
|when=2026-06-11 06:00 SF
|length=1
|window=[[Backport windows|UTC afternoon backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|Lucas_WMDE|Lucas}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-11 07:30 SF
|length=0.5
|window=Test Kitchen Experiment Deployment Window
|who=Test Kitchen
|what=Automatic start/stop of active experiments and instruments managed by [[Test Kitchen]].
}}
{{Deployment calendar event card
|when=2026-06-11 08:00 SF
|length=1
|window=Train log triage
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=See [[Heterogeneous deployment/Train deploys#Breakage]]
}}
{{Deployment calendar event card
|when=2026-06-11 09:00 SF
|length=1
|window=[[Puppet request window]]<br/><small>'''(Max 6 patches)'''</small>
|who={{ircnick|jhathaway|JHathaway}}, {{ircnick|rzl|Reuven}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to Puppet change''
}}
{{Deployment calendar event card
|when=2026-06-11 10:00 SF
|length=1
|window=Cloud Services/Technical Documentation weekly deploy (Toolhub, Developer portal, Striker)
|who={{ircnick|bd808}}
|what=...
}}
{{Deployment calendar event card
|when=2026-06-11 10:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC late)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
{{Deployment calendar event card
|when=2026-06-11 11:00 SF
|length=2
|window=MediaWiki train - Utc-7+Utc-0 Version
|who={{ircnick|dduvall|Dan}}, {{ircnick|jnuche|Jaime}}
|what=[[mw:MediaWiki 1.47/Roadmap#Schedule for the deployments|1.47 schedule]]
{{DeployOneWeekMini|1.47.0-wmf.6|1.47.0-wmf.6|1.47.0-wmf.5->1.47.0-wmf.6}}
* group2 to [[mw:MediaWiki_1.47/wmf.6|1.47.0-wmf.6]]
* '''Blockers: {{phabricator|T423915}}'''
}}
{{Deployment calendar event card
|when=2026-06-11 13:00 SF
|length=1
|window=[[Backport windows|UTC late backport window]]<br/><small>'''Your patch may or may not be deployed at the sole discretion of the deployer'''</small>
|who={{ircnick|RoanKattouw|Roan}}, {{ircnick|urbanecm|Martin}}, {{ircnick|TheresNoTime|Sammy}}, {{ircnick|kindrobot|Stef}}, {{ircnick|cjming|Clare}}
|what={{ircnick|irc-nickname|Requesting Developer}}
* ''Gerrit link to backport or config change''
}}
{{Deployment calendar event card
|when=2026-06-11 14:00 SF
|length=1
|window=Readers deployment window
|who=Readers
|what=NOTE: often skipped, the reader teams do not typically check IRC so assume this is not being used if 5 minutes past the start
}}
{{Deployment calendar event card
|when=2026-06-11 23:00 SF
|length=1
|window=[[MediaWiki_On_Kubernetes#How_to_manage_changes_to_the_infrastructure|MediaWiki infrastructure]] (UTC early)
|who=SRE team
|what=MediaWiki-related infrastructure changes that need a kubernetes deployment.
}}
==={{Deployment_day|date=2026-06-12}}===
{{Deployment calendar event card
|when=2026-06-12 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
{{Deployment calendar event card
|when=2026-06-12 04:00 SF
|length=0.5
|window=GitLab version upgrades
|who={{ircnick|jelto|Jelto}}, {{ircnick|arnoldokoth|Arnold}}, {{ircnick|mutante|Daniel}}, {{ircnick|arnaudb|Arnaud}}
|what=GitLab version upgrades
}}
==={{Deployment_day|date=2026-06-13}}===
{{Deployment calendar event card
|when=2026-06-13 00:00 SF
|length=24
|window=No deploys all day! See [[Deployments/Emergencies]] if things are broken.
|who=
|what=No Deploys
}}
Obsolete:Analytics/Archive/EventLogging
110
4542
2421343
2412110
2026-05-30T13:52:37Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging]] to [[Obsolete:Analytics/Archive/EventLogging]]: pages are obsolete
2412110
wikitext
text/x-wiki
{{Notice|This documentation is outdated. See [[Event_Platform#Event_Platform_documentation_pages|Event Platform documentation]].}}
'''EventLogging''' ('''EL''' for short) is a platform for modelling, logging, and processing arbitrary analytic data. It consists of:
* [[mw:Extension:EventLogging|a MediaWiki extension]] that provides JavaScript and PHP APIs for logging events
* [https://github.com/wikimedia/eventlogging a back-end written in Python] which aggregates events, validates them, and streams them to analytics clients.
This documentation covers the specific EventLogging instance that collects data on Wikimedia sites.
[[File:EventLoggingStag.jpg|alt=EventLogging architecture|left|frameless|500x500px]]
{{TOCright}}
{{Clear}}
== For users ==
=== Schemas ===
Current schemas can be viewed at https://schema.wikimedia.org/. See [[Event Platform/Schemas]] for how to change or create schemas.
Prior to 2021, schemas were stored and maintained through "Schema:" pages on Meta-Wiki. These older schemas are listed at https://meta.wikimedia.org/wiki/Research:Schemas. Each of these also has a talk page that lists the relevant owner/team, its status (active, inactive, in development), its purging strategy, and is the place for discussion and questions. See also [[mw:Extension:EventLogging/Guide#Creating_a_schema|Extension:EventLogging/Guide#Creating_a_schema]] and [[labsconsole:Analytics/Systems/EventLogging/Schema_Guidelines|Analytics/Systems/EventLogging/Schema_Guidelines]].
=== Send events ===
See [[mediawikiwiki:Extension:EventLogging/Programming|Extension:EventLogging/Programming]] for how to instrument your MediaWiki code.
==== Client-side events ====
Client-side events are logged using a [[w:en:web beacon|web beacon]] with the project's hostname (e.g. <code>[https://en.wikipedia.org/ https://en.wikipedia.org]</code>), the path <code>beacon/event</code>, and a query string containing all the event fields (with [[:en:Percent-encoding|percent-encoded]] punctuation). For example:
<nowiki>https://en.wikipedia.org/beacon/event?%7B%22event%22%3A%7B%22version%22%3A1%2C%22action%22%3A%22abort%22</nowiki>...
Decoding the punctuation, this looks like:
<nowiki>https://en.wikipedia.org/beacon/event</nowiki>?
{
"event": { "action": "abort", ... },
"schema": "Edit",
"revision": 1234,
"webHost": "en.wikipedia.org",
"wiki": "enwiki"
}
Because this data is sent through a URL, we can't use URLs that are [https://stackoverflow.com/questions/417142/what-is-the-maximum-length-of-a-url-in-different-browsers/417184#417184 longer than browsers can cope with]. Therefore, EventLogging [https://github.com/wikimedia/mediawiki-extensions-EventLogging/blob/master/modules/ext.eventLogging/core.js#L58 limits] ''unencoded'' client-side events to 2000 characters.
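As a rough illustration of the encoding and the length limit described above, the following Python sketch builds a beacon URL from an event capsule. The function name <code>build_beacon_url</code> and the constant <code>MAX_UNENCODED_LENGTH</code> are hypothetical, and the sketch assumes the limit is checked against the JSON string before percent-encoding; it is not the extension's actual client code.
<syntaxhighlight lang="python">
import json
from urllib.parse import quote

# Hypothetical constant mirroring the 2000-character limit on unencoded events.
MAX_UNENCODED_LENGTH = 2000


def build_beacon_url(host, capsule):
    """Return a beacon URL for `capsule`, or raise if the unencoded JSON is too long."""
    # Serialize compactly (no spaces) so the payload is as short as possible.
    payload = json.dumps(capsule, separators=(",", ":"))
    if len(payload) > MAX_UNENCODED_LENGTH:
        raise ValueError(
            "event is %d characters; limit is %d" % (len(payload), MAX_UNENCODED_LENGTH)
        )
    # Percent-encode all punctuation, including { } " : and ,
    return "https://%s/beacon/event?%s" % (host, quote(payload, safe=""))


capsule = {
    "event": {"version": 1, "action": "abort"},
    "schema": "Edit",
    "revision": 1234,
    "webHost": "en.wikipedia.org",
    "wiki": "enwiki",
}
url = build_beacon_url("en.wikipedia.org", capsule)
print(url)
</syntaxhighlight>
The printed URL begins with the same percent-encoded prefix as the example above (<code>%7B%22event%22%3A%7B%22version%22%3A1...</code>).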
Note that the beacon URL you choose does not actually affect the data logged; for simplicity, both [https://github.com/wikimedia/wikipedia-ios/blob/93229915d710379b8390f48ceaa6bff8d288c4bd/WMF%20Framework/EventLoggingService.swift#L32-L35 the iOS app] and [https://github.com/wikimedia/apps-android-wikipedia/blob/bac1ecfd163dfc83207df62202df6ae55fb23915/app/src/main/java/org/wikipedia/analytics/EventLoggingService.java#L22 the Android app] log all their events to the meta.wikimedia.org beacon even when the events relate to other projects.
Note that anyone can send events to these endpoints, but in production only events whose <code>webHost</code> is a Wikimedia host are processed. There are many clones of our sites running our code (like bad.wikipedia-withadds.com) that are, at this time, sending events to the existing beacon.
=== Accessing data ===
==== Privacy ====
Data stored by EventLogging for the [[m:Special:PrefixIndex/Schema:|various schemas]] has varying degrees of privacy and can include personally identifiable and sensitive information, so access to it requires an [[m:NDA|NDA]]. By default, EL data is kept for only 90 days unless otherwise specified; see [[Analytics/Systems/EventLogging/Data retention]].
See [[Analytics/EventLogging/Data representations]] for an explanation on where the data lives and how to access it.
==== Access ====
See: [[Analytics/Data access#EventLogging data]] and [[Analytics/Data access#Production_access]].
==== Hadoop & Hive ====
Raw JSON data is imported into HDFS from Kafka, and then further [[Analytics/Systems/JsonRefine|refined]] into Parquet-backed Hive tables. These tables live in two [[Analytics/Systems/Cluster/Hive|Hive]] databases, <code>event</code> and <code>event_sanitized</code>, and are stored in HDFS at <code>hdfs:///wmf/data/event</code> and <code>hdfs:///wmf/data/event_sanitized</code>. <code>event</code> stores the original data for 90 days (data older than 90 days is automatically deleted). <code>event_sanitized</code> stores the sanitized data indefinitely. The sanitization process uses a whitelist that indicates which tables and fields can be stored indefinitely; see [[Analytics/Systems/EventLogging/Data retention and auto-purging]]. You can access all this data through Hive, Spark, or other Hadoop methods.
Data from a given hourly period is only refined into Hive two hours after the end of the period, to allow for late arriving events.<ref>[https://github.com/wikimedia/puppet/blob/743ab6288e43c4bfb105106d0b3c3cbe8f0f9dd4/modules/profile/manifests/analytics/refinery/job/refine.pp#L61 wikimedia/puppet/modules/profile/manifests/analytics/refinery/job/refine.pp]</ref>
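The two-hour delay can be turned into a quick calculation of when an event should become visible in Hive. This is a minimal sketch that assumes refinement starts exactly two hours after the end of the hour; the function name <code>earliest_refine_time</code> is hypothetical, and actual job scheduling may add further delay.
<syntaxhighlight lang="python">
from datetime import datetime, timedelta

# Delay after the end of an hourly period before it is refined (from the text above).
REFINE_DELAY = timedelta(hours=2)


def earliest_refine_time(event_time):
    """Return the earliest time at which the hour containing `event_time` is refined."""
    # Truncate to the start of the hourly period the event belongs to.
    hour_start = event_time.replace(minute=0, second=0, microsecond=0)
    hour_end = hour_start + timedelta(hours=1)
    return hour_end + REFINE_DELAY


# An event logged at 19:30 belongs to the 19:00-20:00 period.
print(earliest_refine_time(datetime(2017, 11, 20, 19, 30)))
</syntaxhighlight>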
===== Notes on data in Hive =====
A UDF has been provided in Hive to convert the <code>dt</code> field into a MediaWiki timestamp ([[phab:T186155]]). It can be used to join to MediaWiki-style timestamp strings as follows:
ADD JAR hdfs:///wmf/refinery/current/artifacts/refinery-hive.jar;
CREATE TEMPORARY FUNCTION GetMediawikiTimestamp AS 'org.wikimedia.analytics.refinery.hive.GetMediawikiTimestampUDF';
SELECT GetMediaWikiTimestamp('2019-02-20T12:34:56Z') AS timestamp;
OK
timestamp
20190220123456
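When the data has already been pulled out of Hive (for example into a notebook), the same conversion can be done client-side. The following Python sketch is an equivalent written for illustration, not the UDF's implementation; it assumes <code>dt</code> is a UTC ISO 8601 string with a trailing <code>Z</code> and no fractional seconds, and the function name <code>to_mediawiki_timestamp</code> is hypothetical.
<syntaxhighlight lang="python">
from datetime import datetime


def to_mediawiki_timestamp(dt):
    """Convert an ISO 8601 UTC string such as '2019-02-20T12:34:56Z' to YYYYMMDDHHMMSS."""
    parsed = datetime.strptime(dt, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y%m%d%H%M%S")


print(to_mediawiki_timestamp("2019-02-20T12:34:56Z"))  # 20190220123456
</syntaxhighlight>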
'''NOTE:''' Not all EventLogging analytics schemas are 'refinable'. Some schemas specify invalid field names, e.g. with dots '.' in them, or have field type changes between different records. If this happens, it will not be possible to store the data in a Hive table, and so it won't appear in the list of refined tables. If your schema has this problem, you should fix it. (Dashes '-' in field names are automatically converted to underscores '_' during the refine process before the data is ingested into Hive, cf. [[phab:T216096#4955417]].)
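The field-name rules in the note above can be checked before a schema is deployed. This is a toy sketch of those two rules only (dots are rejected, dashes become underscores); the function name <code>normalize_field_name</code> is hypothetical and the code is not taken from the refine job.
<syntaxhighlight lang="python">
def normalize_field_name(name):
    """Return the field name as it would appear in Hive, or raise if it is not refinable."""
    # Dots are invalid in field names and prevent the schema from being refined.
    if "." in name:
        raise ValueError("field name %r contains a dot and cannot be refined" % name)
    # Dashes are converted to underscores during refinement.
    return name.replace("-", "_")


print(normalize_field_name("page-id"))  # page_id
</syntaxhighlight>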
'''NOTE:''' Hadoop and Hive (in the JVM) are strongly typed, whereas the source EventLogging JSON data is not. This can cause [[phab:T182000#3854135|problems]] when importing into Hive, as the refinement step needs to figure out what to do if it encounters type changes. TYPE CHANGES ARE NOT SUPPORTED. Please do not ever change the type of an EventLogging field. You may add new fields as you need and stop using old ones, but do not change types. Some type changes will be partially supported during the refinement stage. E.g. if the schema contains an integer, but future data contains a decimal number, the refinement step will log a warning, but still finish refinement. The record with the offending type-changed field will have all its fields set to NULL (not just the offending field).
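To make the consequence of a type change concrete, the following toy model mimics the behaviour described in the note: one field with an unexpected type causes every field of that record to become NULL. It is an illustration under simplifying assumptions, not the refine job's logic; <code>refine_record</code> and <code>expected_types</code> are hypothetical names.
<syntaxhighlight lang="python">
def refine_record(record, expected_types):
    """Return a copy of `record`, with every field set to None if any field has the wrong type."""
    for field, expected in expected_types.items():
        value = record.get(field)
        if value is not None and not isinstance(value, expected):
            # A single offending field nulls the whole record, not just that field.
            return {name: None for name in record}
    return dict(record)


expected_types = {"count": int, "action": str}
print(refine_record({"count": 3, "action": "abort"}, expected_types))
print(refine_record({"count": 1.5, "action": "abort"}, expected_types))
</syntaxhighlight>
The second call prints a record whose fields are all <code>None</code>, which is why a type change silently removes otherwise valid data from query results.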
===== Hive =====
EventLogging analytics data is imported into <code>event</code> and <code>event_sanitized</code> databases in Hive.
Note that the EventLogging schema fields are within the <code>event</code> column ([https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_struct.html struct]). You can access them using dot notation, e.g. <code>event.userID</code>.
Basic example:
<syntaxhighlight lang="sql">
SELECT
event.userID,
count(*) as cnt
FROM
event.MobileWikiAppEdit
WHERE
year = 2017 AND month = 11 AND day = 20 AND hour = 19
GROUP BY event.userID
ORDER BY cnt DESC
LIMIT 10;
...
event.userid cnt
NULL 1848
333333 87
222229 59
111113 29
111125 21
466534 17
433542 10
754324 7
121346 7
123452 6
</syntaxhighlight>
Cross-schema example:
<syntaxhighlight lang="sql">
SELECT
nav.event.origincountry,
srv.event.description,
PERCENTILE(nav.event.responsestart, 0.50) AS responsestart_p50,
PERCENTILE(nav.event.responsestart, 0.75) AS responsestart_p75,
COUNT(*) AS count
FROM event.navigationtiming AS nav
JOIN event.servertiming AS srv ON nav.event.pageviewtoken = srv.event.pageviewtoken
WHERE
nav.year = 2020 AND
srv.year = 2020 AND
nav.month = 1 AND
srv.month = 1 AND
nav.day = 28 AND
srv.day = 28 AND
nav.event.isoversample = false
GROUP BY nav.event.origincountry,srv.event.description
HAVING count > 1000;
</syntaxhighlight>
===== Errors for schemas =====
Errors are available in the <code>eventerror</code> table of the <code>event</code> database.
Sample select:
select * from eventerror where event.schema like 'MobileWikiApp%' and year=2018 and month=11 and day=1 limit 10;
===== Spark =====
Spark can access data directly through HDFS, or as SQL tables in Hive. Refer to the [https://spark.apache.org/docs/latest/ Spark documentation] for how to do so. Examples:
====== Spark 2 Scala SQL & Hive: ======
<syntaxhighlight lang="scala">
// spark2-shell
val query = """
SELECT
event.userID,
count(*) as cnt
FROM
event.MobileWikiAppEdit
WHERE
year = 2017 AND month = 11 AND day = 20 AND hour = 19
GROUP BY event.userID
ORDER BY cnt DESC
"""
val result = spark.sql(query)
result.limit(10).show()
...
+--------+----+
| userID| cnt|
+--------+----+
| null|1848|
| 333333| 87|
| 222229| 59|
| 111113| 29|
| 111125| 21|
| 466534| 17|
| 433542| 10|
| 754324| 7|
| 121346| 7|
| 123452| 6|
+--------+----+
</syntaxhighlight>
====== Spark 2 Python SQL & Hive: ======
<syntaxhighlight lang="python">
# pyspark2
query = """
SELECT
event.userID,
count(*) as cnt
FROM
event.MobileWikiAppEdit
WHERE
year = 2017 AND month = 11 AND day = 20 AND hour = 19
GROUP BY event.userID
ORDER BY cnt DESC
"""
result = spark.sql(query)
result.limit(10).show()
...
+--------+----+
| userID| cnt|
+--------+----+
| null|1848|
| 333333| 87|
| 222229| 59|
| 111113| 29|
| 111125| 21|
| 466534| 17|
| 433542| 10|
| 754324| 7|
| 121346| 7|
| 123452| 6|
+--------+----+
</syntaxhighlight>
====== Spark 2 R SQL & Hive: ======
<syntaxhighlight lang="r">
# spark2R
query <- "
SELECT
event.userID,
count(*) as cnt
FROM
event.MobileWikiAppEdit
WHERE
year = 2017 AND month = 11 AND day = 20 AND hour = 19
GROUP BY event.userID
ORDER BY cnt DESC
"
result <- collect(sql(query))
head(result,10)
...
userID cnt
1 NA 1848
2 333333 87
3 222229 59
4 111113 29
5 111125 21
6 466534 17
7 433542 10
8 754324 7
9 121346 7
10 123452 6
</syntaxhighlight>
=====Hadoop. Archived Data =====
In 2017, some big EventLogging tables were [[Analytics/Systems/EventLogging/Administration#Dumping data via sqoop from eventlogging to hdfs|archived from MariaDB to Hadoop]]. Tables were exported with Sqoop into Avro-format files, and Hive tables were created according to the corresponding schema. Thus far we have the following tables archived in Hadoop, in the <code>archive</code> database:
mobilewebuiclicktracking_10742159_15423246
Edit_13457736_15423246
MobileWikiAppToCInteraction_10375484_15423246
MediaViewer_10867062_15423246
pagecontentsavecomplete_5588433_15423246
PageContentSaveComplete_5588433
PageCreation_7481635
PageCreation_7481635_15423246
PageDeletion_7481655
PageDeletion_7481655_15423246
You can query these tables just like any other table in Hive. A tip for dealing with binary types:
select * from Some_tbl where (cast(uuid as string) )='ed663031e61452018531f45b4b5502cb';
Caveat: This process does not preserve the data type for e.g. bigint or boolean fields. The archived Hive table will contain them as strings instead, which will need to be converted back (e.g. <code>CAST(field AS BIGINT)</code>).
=====Hadoop Raw Data =====
Raw EventLogging JSON data is imported hourly into Hadoop by [[Analytics/Systems/Cluster/Gobblin|Gobblin]]. It is unlikely that you will ever need to access this raw data directly. Instead, use the refined <code>event</code> Hive tables as described above.
Raw data is written to directories named after each schema, in hourly partitions in HDFS: <tt>/mnt/hdfs/wmf/data/raw/eventlogging/eventlogging_<schema>/hourly/<year>/<month>/<day>/<hour></tt>. There are a myriad of ways to access this data, including Hive and Spark. Below are a few examples. There may be many (better!) ways to do this.
For backup purposes, we keep 90 days of events coming from the eventlogging-client-side topic in <code>/mnt/hdfs/wmf/data/raw/eventlogging_client_side/hourly/<year>/<month>/<day>/<hour></code>.
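The per-schema hourly layout described above can be generated programmatically, for example when adding partitions for a range of hours. This is a minimal sketch: <code>raw_hourly_path</code> is a hypothetical helper, it returns the HDFS path without the <code>/mnt/hdfs</code> mount prefix, and it assumes month, day and hour are zero-padded to two digits, as in the partition example further down this page.
<syntaxhighlight lang="python">
from datetime import datetime

# HDFS base directory for raw EventLogging data (without the /mnt/hdfs mount prefix).
RAW_BASE = "/wmf/data/raw/eventlogging"


def raw_hourly_path(schema, hour):
    """Return the HDFS directory holding raw events of `schema` for the given hour."""
    return "%s/eventlogging_%s/hourly/%04d/%02d/%02d/%02d" % (
        RAW_BASE, schema, hour.year, hour.month, hour.day, hour.hour
    )


print(raw_hourly_path("CentralNoticeBannerHistory", datetime(2015, 9, 17, 16)))
</syntaxhighlight>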
[[File:EventLogging on Kafka - Lightning Talk.pdf|page=9|thumb|Advantages of processing EL data in Hadoop (lightning talk slide)]]
Note that all EventLogging data in Hadoop is automatically purged after 90 days; the whitelist of fields to retain is not used, but this feature could be added in the future if there is sufficient demand.
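As an illustration of the directory layout and the 90-day purge described above, here is a small sketch that builds the hourly directory for a schema and checks whether a given hour is likely to have been purged already. The helper names are hypothetical, the path uses the HDFS form without the <tt>/mnt/hdfs</tt> mount prefix, and the zero-padded month, day and hour are inferred from the example paths on this page.

<syntaxhighlight lang="python">
from datetime import datetime, timedelta

RAW_BASE = "/wmf/data/raw/eventlogging"
RETENTION = timedelta(days=90)


def hourly_path(schema, hour):
    """Build the raw-data directory for one schema and one hour."""
    return "{}/eventlogging_{}/hourly/{:%Y/%m/%d/%H}".format(RAW_BASE, schema, hour)


def is_probably_purged(hour, now):
    """True if the hour is older than the 90-day retention window."""
    return now - hour > RETENTION


example_hour = datetime(2015, 9, 17, 16)
example_path = hourly_path("CentralNoticeBannerHistory", example_hour)
</syntaxhighlight>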
====== Hive ======
Hive has a couple of built-in functions for parsing JSON. Since EventLogging records are stored as JSON strings, you can access this data by creating a Hive table with a single string column and then parsing that string in your queries:
<syntaxhighlight lang="sql">
ADD JAR file:///usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;
-- Make sure you don't create tables in the default Hive database.
USE otto;
-- Create a table with a single string field
CREATE EXTERNAL TABLE `CentralNoticeBannerHistory` (
`json_string` string
)
PARTITIONED BY (
year int,
month int,
day int,
hour int
)
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.SequenceFileInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'/wmf/data/raw/eventlogging/eventlogging_CentralNoticeBannerHistory';
-- Add a partition
ALTER TABLE CentralNoticeBannerHistory
ADD PARTITION (year=2015, month=9, day=17, hour=16)
LOCATION '/wmf/data/raw/eventlogging/eventlogging_CentralNoticeBannerHistory/hourly/2015/09/17/16';
-- Parse the single string field as JSON and select a nested key out of it
SELECT get_json_object(json_string, '$.event.l.b') as banner_name
FROM CentralNoticeBannerHistory
WHERE year=2015;
</syntaxhighlight>
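The example above adds a single hourly partition by hand. If several hours are needed, the statements can be generated instead of typed. This is only a sketch: the function name is hypothetical, and <code>IF NOT EXISTS</code> is added so the output can be re-run without failing on partitions that already exist.

<syntaxhighlight lang="python">
from datetime import datetime, timedelta


def add_partition_statements(table, schema, start, hours):
    """Yield one ALTER TABLE ... ADD PARTITION statement per hour."""
    for offset in range(hours):
        hour = start + timedelta(hours=offset)
        location = (
            "/wmf/data/raw/eventlogging/eventlogging_{}/hourly/{:%Y/%m/%d/%H}"
        ).format(schema, hour)
        yield (
            "ALTER TABLE {table} ADD IF NOT EXISTS "
            "PARTITION (year={h.year}, month={h.month}, day={h.day}, hour={h.hour}) "
            "LOCATION '{location}';"
        ).format(table=table, h=hour, location=location)


# Two consecutive hours that cross a day boundary.
statements = list(
    add_partition_statements(
        "CentralNoticeBannerHistory",
        "CentralNoticeBannerHistory",
        datetime(2015, 9, 17, 23),
        2,
    )
)
</syntaxhighlight>

The partition values are plain integers, while the directory names are zero-padded, matching the hand-written statement above.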
====== Spark ======
Spark Python (<code>pyspark</code>):
<syntaxhighlight lang="python">
import json
data = sc.sequenceFile("/wmf/data/raw/eventlogging/eventlogging_CentralNoticeBannerHistory/hourly/2015/09/17/07")
records = data.map(lambda x: json.loads(x[1]))
records.map(lambda x: (x['event']['l'][0]['b'], 1)).countByKey()
# Out[33]: defaultdict(<class 'int'>, {'WMES_General_Assembly': 5})
</syntaxhighlight>
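The lambda above raises an exception as soon as one record lacks the nested keys. The extraction logic can be tried locally, without a cluster, on a handful of strings. The sample values below are made up for illustration and only stand in for the JSON strings read from the sequence file.

<syntaxhighlight lang="python">
import json
from collections import Counter


def banner_names(raw_values):
    """Yield the first banner name of each record, skipping malformed ones."""
    for raw in raw_values:
        try:
            yield json.loads(raw)["event"]["l"][0]["b"]
        except (ValueError, KeyError, IndexError, TypeError):
            # Invalid JSON, missing keys or an empty list: ignore the record.
            continue


# Hypothetical stand-ins for the JSON string values of a sequence file.
sample_values = [
    '{"event": {"l": [{"b": "WMES_General_Assembly"}]}}',
    '{"event": {"l": []}}',
    "not json",
    '{"event": {"l": [{"b": "WMES_General_Assembly"}]}}',
]
counts = Counter(banner_names(sample_values))
</syntaxhighlight>

In <code>pyspark</code> the same generator could be applied with <code>mapPartitions</code> instead of a plain loop, although that variant is not tested here.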
MobileWikiAppFindInPage events with SparkSQL in Spark Python (<code>pyspark 1</code>):
<syntaxhighlight lang="python">
# Load the JSON string values out of the compressed sequence file.
# Note that this uses * globs to expand to all data in 2016.
data = sc.sequenceFile(
"/wmf/data/raw/eventlogging/eventlogging_MobileWikiAppFindInPage/hourly/2016/*/*/*"
).map(lambda x: x[1])
# parse the JSON strings into a DataFrame
json_data = sqlCtx.jsonRDD(data) # replace with sqlCtx.read.json(data) for pyspark 2
# Register this DataFrame as a temp table so we can use SparkSQL.
json_data.registerTempTable("MobileWikiAppFindInPage")
top_k_page_ids = sqlCtx.sql(
"""SELECT event.pageID, count(*) AS cnt
FROM MobileWikiAppFindInPage
GROUP BY event.pageID
ORDER BY cnt DESC
LIMIT 10"""
)
for r in top_k_page_ids.collect():
    print("%s: %s" % (r.pageID, r.cnt))
</syntaxhighlight>
Edit events with SparkSQL in Spark scala (spark-shell):
<syntaxhighlight lang="scala">
// Load the JSON string values out of the compressed sequence file
// and parse them as a DataFrame.
val rawDataPath = "/wmf/data/raw/eventlogging/eventlogging_Edit/hourly/2015/10/21/16"
val edits = spark.read.json(
spark.createDataset[String](
spark.sparkContext.sequenceFile[Long, String](rawDataPath).map(_._2)
)
)
// Register this DataFrame as a temporary view so we can use SparkSQL.
edits.createOrReplaceTempView("edits")
// SELECT top 10 edited wikis
val top_k_edits = spark.sql(
"""SELECT wiki, count(*) AS cnt
FROM edits
GROUP BY wiki
ORDER BY cnt DESC
LIMIT 10"""
)
// Print them out
top_k_edits.collect().foreach(println)
</syntaxhighlight>
==== Kafka ====
There are many Kafka tools with which you can read the EventLogging data streams. [https://github.com/edenhill/kafkacat kafkacat] is one that is installed on stat1007.
<syntaxhighlight lang="bash">
# Uses the kafkacat CLI to print $1 seconds
# of data from the topic given as $2.
function kafka_timed_subscribe {
timeout "$1" kafkacat -C -b kafka-jumbo1001 -t "$2"
}
# Prints the top K most frequently
# occurring values from stdin.
function top_k {
sort |
uniq -c |
sort -nr |
head -n $1
}
while true; do
date; echo '------------------------------'
# Subscribe to eventlogging_Edit topic for 5 seconds
kafka_timed_subscribe 5 eventlogging_Edit |
# Filter for the "wiki" field
jq .wiki |
# Count the top 10 wikis that had the most edits
top_k 10
echo ''
done
</syntaxhighlight>
=== Publishing data ===
See [[Analytics/EventLogging/Publishing]] for how to proceed if you want to publish reports based on EventLogging data, or datasets that contain EventLogging data.
=== Verify received events ===
Logstash has eventlogging EventError events. You can view all of these at https://logstash.wikimedia.org/goto/bda91f37481ae4970ee21e11810d49d3
Validation errors are visible on application logs located at
/srv/log/eventlogging/systemd
In production they also end up in the kafka topic
eventlogging_EventError
There is also a Hive table named <code>event.eventerror</code>.
The processor handles validation, so, for example,
eventlogging_processor-client-side-<some>.log
will have an error like the following if events are invalid:
<syntaxhighlight lang="bash">
Unable to validate: ?{
"event": {
"pagename": "Recentchanges",
"namespace": null,
"invert": false,
"associated": false,
"hideminor": false,
"hidebots": true,
"hideanons": false,
"hideliu": false,
"hidepatrolled": false,
"hidemyself": false,
"hidecategorization": true,
"tagfilter": null
},
"schema": "ChangesListFilters",
"revision": 15876023,
"clientValidated": false,
"wiki": "nowikimedia",
"webHost": "no.wikimedia.org",
"userAgent": "Apple-PubSub/65.28"
}; cp1066.eqiad.wmnet 42402900 2016-09-26T07:01:42 -
</syntaxhighlight>
This happens when client code has a bug and sends events that are not valid according to the schema. We normally try to identify the schema at fault and pass that information back to the developers so they can fix it. For an example of how we deal with these errors, see https://phabricator.wikimedia.org/T146674
As of [[phab:T205437|T205437]], validation error logs are also available in Logstash for up to 30 days, e.g. https://logstash.wikimedia.org/goto/4882115feb72bdcfa812ace67b02e5bb. A handy link to the associated Kibana search is available on a schema's talk page, provided that it's documented using [[metawiki:Template:SchemaDoc|the SchemaDoc template]].
Note well that access to Logstash requires a Wikimedia developer account with membership in a user group indicating that the user has signed an [[m:NDA|NDA]].
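When going through many entries in the processor log, it can help to pull out the schema and revision automatically. The sketch below assumes every entry has the shape shown in the example above: the text <code>Unable to validate: </code>, a JSON object, then <code>;</code> followed by whitespace-separated fields. Reading those trailing fields as host, sequence ID and timestamp is an interpretation of the sample, not something this page confirms.

<syntaxhighlight lang="python">
import json


def parse_validation_error(entry):
    """Split an 'Unable to validate' entry into the event and trailing fields."""
    prefix = "Unable to validate: "
    if not entry.startswith(prefix):
        raise ValueError("not a validation error entry")
    start = entry.index("{")   # the JSON object starts at the first brace
    end = entry.rindex("};")   # and ends at the last "};"
    event = json.loads(entry[start:end + 1])
    # Assumed meaning of the trailing fields: host, sequence ID, timestamp.
    host, sequence_id, timestamp = entry[end + 2:].split()[:3]
    return {
        "schema": event.get("schema"),
        "revision": event.get("revision"),
        "host": host,
        "sequence_id": int(sequence_id),
        "timestamp": timestamp,
    }


# Shortened version of the log entry shown above.
sample_entry = (
    'Unable to validate: ?{"event": {"pagename": "Recentchanges"}, '
    '"schema": "ChangesListFilters", "revision": 15876023}; '
    "cp1066.eqiad.wmnet 42402900 2016-09-26T07:01:42 -"
)
parsed = parse_validation_error(sample_entry)
</syntaxhighlight>

Counting the <code>schema</code> values over a whole log file then shows which instrumentation is sending the most invalid events.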
=== User agent sanitization ===
''Main article:'' [[Analytics/Systems/EventLogging/User agent sanitization]]
The <code>userAgent</code> field is sanitized immediately upon storage; the content is replaced with a parsed version in JSON format.
=== Data retention and purging ===
{{Main|Analytics/Systems/EventLogging/Data retention}}
By default, all EventLogging data is deleted after 90 days to comply with our [[m:data retention guidelines|data retention guidelines]].
However, individual properties within schemas can be whitelisted so that the data is retained indefinitely; generally, all columns can be whitelisted, except the <code>clientIp</code> and <code>userAgent</code> fields. This whitelist is maintained in the <code>analytics/refinery</code> repo as <code>static_data/eventlogging/whitelist.yaml</code>.
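The effect of the whitelist on a single event can be pictured as follows. This is a conceptual sketch only: it does not read <code>whitelist.yaml</code> or reproduce its format, and the event and the list of whitelisted fields are hypothetical.

<syntaxhighlight lang="python">
# Fields that cannot be whitelisted, as stated above.
NEVER_RETAINED = {"clientIp", "userAgent"}


def retained_fields(event, whitelisted):
    """Keep only whitelisted fields, never the always-purged ones."""
    keep = set(whitelisted) - NEVER_RETAINED
    return {name: value for name, value in event.items() if name in keep}


# Hypothetical event and whitelist entries.
old_event = {
    "wiki": "enwiki",
    "action": "click",
    "clientIp": "192.0.2.1",
    "userAgent": "ExampleBrowser/1.0",
}
kept = retained_fields(old_event, ["wiki", "action", "userAgent"])
</syntaxhighlight>

Even though <code>userAgent</code> appears in the hypothetical whitelist, it is dropped, mirroring the rule that it can never be retained.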
=== Retiring a schema ===
When you no longer want to collect a particular data stream, there are a few cleanup steps you should take:
* Remove the instrumentation code
* Mark the schema inactive by editing the [[meta:Template:SchemaDoc|SchemaDoc template]] on its talk page.
* Remove its entries from the [https://github.com/wikimedia/analytics-refinery/blob/master/static_data/eventlogging/whitelist.yaml whitelist] (so it's easy for others to review what's actively being retained).
* Request the deletion of any previously whitelisted data if it is no longer necessary.
== Operational support ==
=== Tier 2 support ===
[[Analytics/Tier2]]
=== Outages ===
Any outages that affect EventLogging will be tracked on <b>[[Incident documentation]] </b>(also listed [[#Incidents|below]]) and announced to the lists [https://lists.wikimedia.org/mailman/listinfo/eventlogging-alerts eventlogging-alerts@lists.wikimedia.org] and [mailto:ops@lists.wikimedia.org ops@lists.wikimedia.org].
== Alarms ==
Alarms at this time come to the Analytics team. We are working on being able to claim alarms in [[icinga]].
== Contact ==
You can contact the analytics team at: [mailto:analytics@lists.wikimedia.org analytics@lists.wikimedia.org]
== For developers ==
=== Codebase ===
The EventLogging python codebase can be found at https://gerrit.wikimedia.org/r/#/admin/projects/eventlogging
=== Architecture ===
See [[Analytics/EventLogging/Architecture]] for EventLogging architecture.
=== Performance ===
On this page you'll find information about Event Logging performance, such as load tests and benchmarks:
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/Performance
=== Size limitation ===
There is a limit on the size of individual EventLogging events due to the underlying infrastructure (the limited size of URLs in Varnish's varnishncsa/varnishlog, as well as Wikimedia UDP packets). For the purpose of this limit, an "entry" is a <code>/beacon</code> request URL containing urlencoded JSON-stringified event data. Entries longer than 1014 bytes are truncated. A truncated entry fails validation because the result is invalid JSON and cannot be parsed.
Take this into account when creating a schema. Avoid large schemas, as well as schema fields with long keys and/or values. Consider splitting up a very large schema, or replacing long fields with shorter ones.
To aid with testing the length of schemas, EventLogging's dev-server logs a warning into the console for each event that exceeds the size limit.
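The size of an entry can also be estimated offline. The following sketch makes several assumptions that are not confirmed on this page: that the URL prefix is <code>/beacon/event?</code>, that the prefix counts towards the 1014 bytes, and that Python's <code>quote</code> is a close enough stand-in for the client-side URL encoding (the two differ for a few punctuation characters). Treat the result as an estimate, not as the exact length the servers will see.

<syntaxhighlight lang="python">
import json
from urllib.parse import quote

MAX_ENTRY_BYTES = 1014
BEACON_PREFIX = "/beacon/event?"  # assumed prefix, see the note above


def entry_length(capsule):
    """Estimate the length of the beacon URL for one event capsule."""
    # Compact JSON, keeping non-ASCII characters so they are percent-encoded
    # as UTF-8 bytes rather than as \uXXXX escapes.
    payload = json.dumps(capsule, separators=(",", ":"), ensure_ascii=False)
    return len(BEACON_PREFIX + quote(payload, safe=""))


def fits_in_entry(capsule):
    """True if the estimated entry is not longer than the limit."""
    return entry_length(capsule) <= MAX_ENTRY_BYTES


# Hypothetical capsules: one small, one with an overly long field value.
small = {"schema": "Edit", "revision": 1, "event": {"action": "save"}}
large = {"schema": "Edit", "revision": 1, "event": {"text": "x" * 2000}}
</syntaxhighlight>

Note how quickly the encoding inflates the payload: every quote, brace, colon and comma in the JSON takes three bytes in the URL.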
=== Monitoring ===
You can use various tools to monitor operational metrics, read more in this dedicated page:
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/Monitoring
=== Testing ===
The EventLogging extension can easily be tested on Vagrant, as described on mediawiki.org at [[mediawikiwiki:Extension:EventLogging|Extension:EventLogging]]. The server side of EventLogging (the consumer of events) does not have a Vagrant setup for testing, but it can be tested in the Beta Cluster:
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/TestingOnBetaCluster
=== How do I ...? ===
Visit the EventLogging how-to page. It contains dev-ops tips and tricks for EventLogging, such as deploying, troubleshooting and restarting. Please add any step-by-step guides for EventLogging dev-ops tasks there.
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/How_to
== Administration. On call ==
Here's a list of routine tasks to do when oncall for EventLogging.
https://wikitech.wikimedia.org/wiki/Analytics/EventLogging/Oncall
== Data Quality Issues ==
=== Changes and Known Problems with Dataset ===
{| class="wikitable"
|-
! Date from !! Date until !! Task !! Details
|-
|2020-06-18T20:00:00Z
|2020-06-19T22:00:00Z
|[[phab:T249261|Task T249261]]
| While attempting the first migration of legacy EventLogging streams to EventGate, Otto misconfigured the EventLogging extension's <tt>$wgEventLoggingServiceUri</tt> for non-group0 wikis, effectively causing SearchSatisfaction events to be disabled on all non-group0 wikis.
|
|-
|2019-09-23
|2019-09-29
|[[phab:T233718|Task T233718]]
| Many events emitted by MediaWiki are missing in Hive refined event database tables, including events from mediawiki_revision_create, mediawiki_page_create, etc. This was caused by a problem when importing data from Kafka via Camus, but at the time was only known to affect mediawiki_api_request and mediawiki_cirrussearch_request. Data for other mediawiki_* tables was not backfilled, and the raw data has since been deleted.
|
|-
|2017-11
|2017-11
|[[phab:T179625|Task T179625]]
|Canonical EventLogging data (parsed and validated and stored in Kafka) did not match EventCapsule schema. This was fixed, and data was transformed before insertion into MySQL for backwards compatibility. This helped standardize all event data so that it could be refined and made available in Hive.
|
|-
| 2017-07-10
| 2017-07-12
|{{Phabricator|T170486}}
| Some data was not inserted in MySQL, but was backfilled for all schemas but page-create. During the backfill, bot events were also accidentally backfilled, resulting in extra data during this time.
|
|-
| 2017-05-24
| onwards
|{{Phabricator|T67508}}
| Do not accept data from bots on eventlogging unless bot user agent matches "MediaWiki".
|
|-
| 2017-03-29
| onwards
|{{Phabricator|T153207}}
| Change userAgent field in event capsule
|
|-
|2019-03-19 (14 to 22 hours)
|
|{{Phabricator|T218831}}
| The EventLogging MySQL consumer kept restarting for several hours, during which it was unable to insert any data into the database.
|
|-
|2019-04-01
|
|[[phab:T219842|Task T219842]]
| Kafka Jumbo outage from 22:00 to midnight. Data was lost during those hours.
|
|-
|2019-09-12
|
|https://phabricator.wikimedia.org/T228557
|Third party domain data is not getting refined (so sites like w.upupming.site that run clones of our code do not send us their requests)
|
|}
=== Incidents ===
Here's a [[:Category:EventLogging/Incident documentation|list of all related incidents and their post-mortems]]. To add a new page to this generated list, use the "<code>EventLogging/Incident_documentation</code>" category.
For all the incidents (including ones not related to EventLogging) see: [[Incident documentation]].
=== Limits of the eventlogging replication script ===
The log database is replicated to the eventlogging slave databases via a custom script, called eventlogging_sync.sh (script stored in operations/puppet for the curious). While working on https://phabricator.wikimedia.org/T174815 we realized that the script was not able to replicate high volume events in real time, showing a lot of replication lag (even days in the worst case scenario). Please review the task for more info or contact the Analytics team in case you have more questions.
=== Ad blockers ===
Our client-side analytics instrumentation is subject to interference by any ad blocking software the user has installed. See, for example, [[phab:T240697|T240697]]/[[phab:T251464|T251464]], in which no-JS editor counts were skewed by unaccounted-for ad blockers. Ad blockers typically work by comparing outgoing requests to a list of disallowed URL domains, paths, or other patterns. For example, ad blockers using the popular [https://easylist.to/easylist/easyprivacy.txt EasyPrivacy] block list block requests from page scripts to paths matching <code>/beacon/event?</code> (affecting legacy EventLogging) as well as to the domain <code>intake-analytics.wikimedia.org</code> (affecting requests to the new event platform intake service).
The following ad blockers are known as of February 2021 to interfere with WMF analytics instrumentation when using default settings. (Note that most if not all ad blockers allow users to add block lists and custom rules that could result in WMF analytics requests being blocked.)
{| class="wikitable"
!Name
!Client platforms affected
!Analytics intake systems affected
!Notes
|-
|[[:en:UBlock Origin|uBlock Origin]]
|Web (desktop + mobile)
|EventLogging, MEP
|EasyPrivacy enabled by default
|-
|[[:en:Brave (web browser)|Brave (web browser)]]
|Web (desktop + mobile)
|MEP
|Blocks requests to <code>intake-analytics.wikimedia.org</code> when using standard (default) privacy settings
|}
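The two patterns mentioned above can be turned into a rough check for whether a given request URL is likely to be blocked under default settings. This is a simplified sketch: it does not implement real block-list syntax, it only covers the two patterns named on this page, and the example URLs (including the <code>/v1/events</code> path) are hypothetical.

<syntaxhighlight lang="python">
from urllib.parse import urlsplit

BLOCKED_PATH_FRAGMENT = "/beacon/event?"           # legacy EventLogging
BLOCKED_DOMAIN = "intake-analytics.wikimedia.org"  # event platform intake


def likely_blocked(url):
    """True if the URL matches one of the two patterns described above."""
    if urlsplit(url).hostname == BLOCKED_DOMAIN:
        return True
    return BLOCKED_PATH_FRAGMENT in url


# Hypothetical request URLs.
legacy_url = "https://en.wikipedia.org/beacon/event?%7B%7D"
intake_url = "https://intake-analytics.wikimedia.org/v1/events"
page_url = "https://en.wikipedia.org/wiki/Main_Page"
</syntaxhighlight>

A check like this can be useful when reasoning about which instrumentation paths are affected, but only a test with the actual ad blocker shows what is really blocked.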
== See also ==
* [[Analytics/EventLogging/Outages]]
* [[Analytics/EventLogging/New pipeline]]
* [[Analytics/EventLogging/Sanitization vs Aggregation]]
* "EventLogging on Kafka". October 2015 lightning talk: [[:File:EventLogging_on_Kafka_-_Lightning_Talk.pdf|slides]], [https://www.youtube.com/watch?v=yUQ5d192z3M video]
== Notes ==
<references />
[[Category:Services]]
[[Category:Data platform]]
[[Category:Data platform systems]]
bxrkaimip1mn1xo2n7zds7hrfkkaqze
Obsolete talk:Analytics/Archive/EventLogging
111
4543
2421345
2259970
2026-05-30T13:52:37Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging]] to [[Obsolete talk:Analytics/Archive/EventLogging]]: pages are obsolete
1755642
wikitext
text/x-wiki
{{Note|'''Note''': Deployment of event tracking started in August, 2012 and is currently ongoing.}}
== How to request access to EventLogging data ==
I don't see described or linked here how to get access to event logging data in the WMF environment. It may be useful to get that done. I have an email from Ori of June 2013, where he writes that an RT ticket should be filed requesting access to "stat1", but here [[stat1]] has an unknown status with very little information. In any case, I need access to the logging for [[meta:Schema:UniversalLanguageSelector]] and I've created RT ticket 5710 with the following request: "Language Engineering is enabling EventLogging for UniversalLanguageSelector
over the next two weeks. We'll be logging into the schema https://meta.wikimedia.org/wiki/Schema:UniversalLanguageSelector. Please provide me access to the machine/database where the logged events
end up." I hope this will lead to success. Will update status here, and document the process, if no one has done it before me. [[User:Siebrand|siebrand]] ([[User talk:Siebrand|talk]]) 09:14, 4 September 2013 (UTC)
sfuvqpuny1ijt22jdib8u2w77rj290p
Event tracking
0
4572
2421500
2266564
2026-05-31T09:23:02Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421500
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Talk:Event tracking
1
4573
2421517
2266578
2026-05-31T09:23:23Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421517
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Event logging
0
4817
2421498
2266562
2026-05-31T09:23:00Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421498
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Talk:Event logging
1
4818
2421516
2266577
2026-05-31T09:23:22Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421516
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Server Admin Log
0
7919
2421401
2421339
2026-05-30T16:21:03Z
Stashbot
7414
arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
2421401
wikitext
text/x-wiki
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs 101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since the service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] (duration: 06m 37s)
* 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]]
* 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s)
* 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]]
* 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]])
* 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s)
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]]
* 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet
* 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet
* 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet
* 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet
* 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet
* 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet
* 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet
* 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet
* 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet
* 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs
* 21:52 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2019.codfw.wmnet
* 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet
* 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet
* 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet
* 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet
* 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet
* 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp
* 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet
* 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet
* 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
* 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1022.eqiad.wmnet
* 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet
* 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet
* 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s)
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment
* 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet
* 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet
* 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI because a Java upgrade left integration/pipelinelib unable to load.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]]
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed
* 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:20 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
* 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet
* 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet
* 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet
* 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet
* 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048
* 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet
* 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet
* 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet
* 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet
* 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet
* 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet
* 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2008.codfw.wmnet
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed
* 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 slyngshede@dns1004: END - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 10:21 slyngshede@dns1004: START - running authdns-update
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet
* 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet
* 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet
* 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet
* 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot
* 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet
* 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T415165|T415165]]
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet
* 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet
* 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed
* 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet
* 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet
* 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
* 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 09:28 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet
* 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad
* 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]]
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org
* 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389
* 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947
* 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet
* 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947
* 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie
* 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet
* 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet
* 09:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org
* 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org
* 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed
* 08:26 moritzm: installing Java 11 security updates
* 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie
* 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm (forward port of latest Java 8 security release)
* 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot
* 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot
* 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot
* 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot
* 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot
* 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
* 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot
* 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s)
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 07:19 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]]
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s)
* 07:11 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 07:09 moritzm: remove haveged
* 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.*
* 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet
* 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.*
* 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s)
* 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
* 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:16 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036
* 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036
* 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder
* 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s)
* 20:51 sbassett@deploy1003: sbassett: Continuing with deployment
* 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]]
* 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s)
* 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 20:34 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]]
* 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s)
* 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment
* 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 tgr@deploy1003: cscott, tgr: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]]
* 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s)
* 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 19:53 otto@deploy1003: otto: Continuing with deployment
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d] (duration: 02m 00s)
* 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org
* 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 18:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia-[[phab:T401832|T401832]]
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:51 joal@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s)
* 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d]
* 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart
* 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart
* 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]]
* 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed
* 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 15:25 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83
* 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet
* 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet
* 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s)
* 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]]
* 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s)
* 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart
* 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]]
* 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s)
* 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
* 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]]
* 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s)
* 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
* 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s)
* 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment
* 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json
* 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json
* 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:34 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet
* 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 14:27 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie
* 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie
* 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2005.codfw.wmnet
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet
* 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet
* 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet
* 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet
* 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2003.codfw.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet
* 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
* 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 13:37 Lucas_WMDE: UTC afternoon backport+config window done
* 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main'.
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet
* 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
* 07:33 XioNoX: add gnmic 0.46.0 to reprepro
* 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s)
* 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover
* 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:14 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover
* 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]]
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
* 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json
* 06:54 moritzm: installing qemu security updates
* 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json
* 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]]
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json
* 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]]
* 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
* 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet
* 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json
* 06:19 fceratto@dns1005: END - running authdns-update
* 06:18 fceratto@dns1005: START - running authdns-update
* 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json
* 06:14 fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json
* 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]]
* 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json
* 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]]
* 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s)
* 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:58 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s)
* 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]]
* 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s)
* 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment
* 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s)
* 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment
* 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]]
* 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
* 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
* 21:16 mutante: gerrit-replica.wikimedia.org back online
* 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends
* 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:30 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:20 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:48 jhathaway@dns1004: END - running authdns-update
* 18:46 jhathaway@dns1004: START - running authdns-update
* 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet
* 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
* 18:26 herron: rebooting alert1002
* 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet
* 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet
* 18:16 mutante: releases.wikimedia.org - rebooting backends
* 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
* 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]]
* 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
* 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
* 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:46 herron: rebooting alert2002
* 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet
* 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
* 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet
* 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet
* 17:37 mutante: stewards* - rebooting
* 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
* 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet
* 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet
* 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet
* 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet
* 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:14 mutante: doc.wikimedia.org - rebooting backends
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet
* 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:13 topranks: restarted gnmic on netflow3004 as series missing for cr2-esams
* 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
* 17:11 mutante: etherpad - rebooting backends
* 17:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet
* 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 17:04 mutante: contint2002, phab2002 - rebooting
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:37 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fixup data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]]
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to /var/cache/conftool/dbconfig/20260518-124030-fceratto.json
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json
* 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 Dreamy_Jazz: Ran <code>scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-T423840.patch</code> ([[phab:T423840|T423840]])
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]]
* 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json
* 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]]
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
* 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
* 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye
* 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye
* 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
* 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 11:21 slyngshede@dns1004: END - running authdns-update
* 11:19 slyngshede@dns1004: START - running authdns-update
* 11:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org
* 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org
* 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org
* 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet
* 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:56 slyngshede@dns1004: END - running authdns-update
* 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye
* 10:54 slyngshede@dns1004: START - running authdns-update
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye
* 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
* 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
* 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json
* 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw
* 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json
* 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json
* 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json
* 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to /var/cache/conftool/dbconfig/20260518-100710-fceratto.json
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json
* 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]]
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json
* 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]]
* 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye
* 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]]
* 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:18 moritzm: installing Java 21 security updates
* 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]]
* 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts
* 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082
* 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s)
* 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 09:03 ayounsi@dns1004: END - running authdns-update
* 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s)
* 09:01 ayounsi@dns1004: START - running authdns-update
* 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet
* 08:50 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix updates from bookworm point release
* 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet
* 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json
* 06:59 moritzm: installing systemd bugfix updates from trixie point release
* 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts
* 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts
* 06:49 moritzm: installing glibc bugfix updates from trixie point release
* 06:44 moritzm: installing openssl bugfix updates from trixie point release
* 06:33 moritzm: installing Linux 6.12.88 on trixie hosts
* 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:11 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts
* 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet
* 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065
* 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065
* 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]]
* 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye
* 09:32 topranks: Migrate cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065
* 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]]
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064
* 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:03 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064
* 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json
* 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json
* 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json
* 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065
* 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064
* 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064
* 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
* 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)
* 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:47 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json
* 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json
* 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye
* 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to /var/cache/conftool/dbconfig/20260514-104521-marostegui.json
* 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye
* 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye
* 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned
* 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned
* 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned
* 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye
* 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye
* 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye
* 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye
* 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye
* 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json
* 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]])
* 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3
* 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie
* 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie
* 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]]
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s)
* 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment
* 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]]
* 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:20 cmooney@dns2005: END - running authdns-update
* 18:19 cmooney@dns2005: START - running authdns-update
* 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:42 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure
* 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul
* 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
* 15:36 fabfur: repooling cp7009 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
* 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]])
* 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:16 cmooney@dns2005: END - running authdns-update
* 15:15 cmooney@dns2005: START - running authdns-update
* 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie
* 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s)
* 14:37 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]]
* 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s)
* 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:16 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:15 jforrester@deploy1003: jforrester: Continuing with deployment
* 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]]
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:08 Lucas_WMDE: UTC afternoon backport+config window done
* 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}}
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}}
* 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie
* 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply
* 13:59 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}}
* 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s)
* 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]]
* {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}}
* 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment
* {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}}
* 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}}
* 13:25 moritzm: installing openjdk-11 security updates
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s)
* 13:07 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw
* 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]]
* 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s)
* 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]]
* 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
* 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]])
* 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]]
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]]
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie
* 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed
* 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie
* 11:40 topranks: delete old direct ibgp peering between cr1-drmrs and cr2-drmrs [[phab:T424611|T424611]]
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:27 topranks: add ibgp peering between cr1-drmrs and cr2-drmrs over loopback IPs [[phab:T424611|T424611]]
* 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts
* 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie
* 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie
* 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie
* 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie
* 10:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts
* 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]]
* 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie
* 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki2002.codfw.wmnet
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:10 moritzm: installing Apache security updates on Bullseye
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie
* 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye
* 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie
* 10:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json
* 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye
* 09:56 moritzm: installing distro-info-data updates from Bookworm point release
* 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]]
* 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]]
* 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye
* 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json
* 09:51 moritzm: installing ca-certificates update from Bookworm point release
* 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye
* 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s)
* 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:41 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]]
* 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:28 cmooney@dns2005: END - running authdns-update
* 09:27 cmooney@dns2005: START - running authdns-update
* 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet
* 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage
* 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye
* 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye
* 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 moritzm: installing dnsmasq security updates
* 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 08:38 cmooney@dns2005: END - running authdns-update
* 08:37 cmooney@dns2005: START - running authdns-update
* 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s)
* 08:20 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie
* 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay
* 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie
* 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie
* 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie
* 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie
* 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie
* 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:42 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm
* 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277
* 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm
* 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276
* 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm
* 01:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s)
* 01:28 zabe@deploy1003: zabe: Continuing with deployment
* 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]]
* 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:08 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] (duration: 12m 45s)
* 23:40 cscott@deploy1003: cscott: Continuing with deployment
* 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]]
* 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s)
* 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm
* 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update
* 21:57 dwisehaupt@dns1004: START - running authdns-update
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]]
* 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273
* 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm
* 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273
* 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* 12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s)
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie
* 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet
* 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]]
* 06:27 jayme@dns1004: END - running authdns-update
* 06:26 jayme@dns1004: START - running authdns-update
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] (duration: 36m 36s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s)
* 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]]
* 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s)
* 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm
* 19:58 zabe@deploy1003: zabe: Continuing with deployment
* 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]]
* 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts
* 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper
* 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1269
* 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:16 dzahn@dns1005: END - running authdns-update
* 19:14 dzahn@dns1005: START - running authdns-update
* 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva` to free up disk space
* 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]]
* 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json
* 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268
* 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json
* 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json
* 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:12 dancy@deploy1003: Installation of scap version "4.263.0" completed for 2 hosts
* 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s)
* 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json
* 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json
* 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s)
* 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:23 zabe@deploy1003: zabe: Continuing with deployment
* 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s)
* 15:42 zabe@deploy1003: zabe: Continuing with deployment
* 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]]
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017
* 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017
* 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 Lucas_WMDE: UTC afternoon backport+config window done
* 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18
* 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment
* 14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now]]
* 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]]
* 14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (]]
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group]]
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
* 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw
* 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T]]
* 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]]
* 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s)
* 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
* 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie
* 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]]
* 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)
* 13:06 elukey: remove old discovery pki intermediate
* 13:03 otto@deploy1003: otto: Continuing with deployment
* 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]
* 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s)
* 12:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]]
* 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]])
* 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot
* 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts
* 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s)
* 11:21 jayme@deploy1003: Rolling back deployment
* 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]]
* 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]]
* 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts
* 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s)
* 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image
* 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:16 slyngs: Migrate of lvs2012 due to hardware issues
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s)
* 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]]
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]]
* 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]]
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:37 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json
* 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1230: [[phab:T419635|T419635]]
* 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json
* 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 08:10 slyngshede@dns1004: END - running authdns-update
* 08:08 slyngshede@dns1004: START - running authdns-update
* 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors
* 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS bookworm
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm
* 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-08 ==
* 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267
* 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm
* 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266
* 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm
* 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265
* 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/
* 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health
* 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps
* 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad
* 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart
* 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json
* 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json
* 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json
* 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie
* 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org
* 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie
* 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie
* 06:11 moritzm: installing postorius security updates
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie
* 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024
* 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023
* 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox
== 2026-05-07 ==
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s)
* 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19]
* 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s)
* 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19]
* 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s)
* 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19]
* {{safesubst:SAL entry|1=21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 21:23 cscott@deploy1003: cscott: Continuing with deployment
* 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t
* {{safesubst:SAL entry|1=21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s)
* 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment
* 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
* 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]]
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki
* 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s)
* 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]]
* 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie
* 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022
* 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:26 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:06 cdanis@dns1005: END - running authdns-update
* 18:04 cdanis@dns1005: START - running authdns-update
* 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s)
* 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis
* 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 17:51 krinkle@deploy1003: krinkle: Continuing with deployment
* 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]]
* 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart
* 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 16:32 jynus: restarting backup1-* database primary hosts
* 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart
* 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart
* 16:14 sukhe@dns1004: END - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
* 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
* 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet
* 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs
* 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
* 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s)
* 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c]
* 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s)
* 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
* 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c]
* 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s)
* 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:32 slyngshede@dns1004: END - running authdns-update
* 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c]
* 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 14:30 slyngshede@dns1004: START - running authdns-update
* 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train
* 14:30 jmm@dns1004: END - running authdns-update
* 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 14:28 jmm@dns1004: START - running authdns-update
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox
* 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
* 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
* 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s)
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]]
* 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s)
* 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment
* 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]]
* 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 12:45 sukhe@dns1004: FAIL - running authdns-update
* 12:44 sukhe@dns1004: START - running authdns-update
* 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install5004.wikimedia.org with OS bookworm
* 12:23 slyngshede@dns1004: FAIL - running authdns-update
* 12:21 slyngshede@dns1004: START - running authdns-update
* 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:12 slyngshede@dns1004: FAIL - running authdns-update
* 12:11 slyngshede@dns1004: START - running authdns-update
* 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: after reimage to trixie
* 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie
* 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage
* 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie
* 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie
* 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json
* 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie
* 11:11 moritzm: installing modsecurity-apache security updates
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s)
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:54 root@cumin1003: START - Cookbook sre.dns.netbox
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json
* 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184
* 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie
* 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]]
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json
* 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage
* 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json
* 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie
* 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie
* 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie
* 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie
* 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie
* 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org
* 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:13 moritzm: rebalance ganeti cluster in ulsfo following host reimages [[phab:T424686|T424686]]
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie
* 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org
* 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2182.codfw.wmnet with OS trixie
* 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie
* 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie
* 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie
* 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org
* 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS trixie
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie
* 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json
* 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage
* 08:23 XioNoX: drmrs: remove old v6 gateway IP
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s)
* 07:49 dcausse@deploy1003: dcausse: Continuing with deployment
* 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]]
* 07:32 moritzm: installing apache2 security updates
* 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie
* 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie
* 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie
* 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie
* 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json
* 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json
* 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]]
* 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]]
* 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json
* 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s)
* 01:09 zabe@deploy1003: zabe: Continuing with deployment
* 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]]
* 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s)
* 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]]
* 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s)
* 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:10 cjming@deploy1003: cjming: Continuing with deployment
* 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]]
* 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s)
* 21:48 zabe@deploy1003: zabe: Continuing with deployment
* 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]]
* 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s)
* 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment
* 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]]
* 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie
* 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm
* 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:37 dzahn@dns1005: END - running authdns-update
* 18:35 dzahn@dns1005: START - running authdns-update
* 18:33 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1
* 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm
* 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work
* 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie
* 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie
* 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]]
* 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:45 jgreen@dns1004: END - running authdns-update
* 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s)
* 13:44 jgreen@dns1004: START - running authdns-update
* 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors
* 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors
* 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:31 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]]
* 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie
* 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie
* 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie
* 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie
* 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie
* 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie
* 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie
* 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s)
* 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie
* 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]]
* 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:50 moritzm: installing openjdk-17 security updates
* 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie
* 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie
* 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie
* 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json
* 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json
* 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json
* 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json
* 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json
* 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]]
* 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie
* 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:20 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet
* 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]]
* 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie
* 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie
* 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie
* 05:11 marostegui@dns1004: END - running authdns-update
* 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json
* 05:09 marostegui@dns1004: START - running authdns-update
* 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json
* 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json
* 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json
* 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s)
* 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]]
* 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s)
* 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment
* 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]]
* 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts
* 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s)
* 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s)
* 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment
* 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be ve
* 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]]
* 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s)
* 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie
* 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment
* 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002
* 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002
* 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox
* 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie
* 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts
* 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s)
* 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation
* 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s)
* 18:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]]
* 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s)
* 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw
* 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw
* 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad
* 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad
* 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json
* 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json
* 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json
* 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json
* 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie
* 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json
* 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie
* 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json
* 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json
* 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie
* 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie
* 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json
* 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json
* 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json
* 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json
* 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie
* 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json
* 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json
* 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 08:50 moritzm: installing augeas security updates
* 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json
* 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json
* 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json
* 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json
* 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json
* 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]]
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie
* 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json
* 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]]
* 08:05 ayounsi@dns1004: END - running authdns-update
* 08:03 ayounsi@dns1004: START - running authdns-update
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json
* 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie
* 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json
* 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json
* 07:55 awight: EU morning deployment was fun
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to /var/cache/conftool/dbconfig/20260505-075416-fceratto.json
* 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]]
* 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json
* 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]]
* 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]]
* 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]]
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie
* 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s)
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment
* 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]]
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie
* 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Reimage to Trixie
* 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie
* 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie
* 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 (duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]]
* 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s)
* 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment
* 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]]
* 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s)
* 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging
* 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s)
* 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]]
* 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s)
* 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment
* 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]]
* 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s)
* 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie
* 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment
* 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]]
* 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors
* 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting
* 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie
* 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s)
* 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]]
* 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s)
* 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]]
* 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s)
* 18:11 dancy@deploy1003: dancy: Rolling back deployment
* 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:09 dancy@deploy1003: Started scap sync-world: testing
* 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts
* 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s)
* 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s)
* 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]]
* 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json
* 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts
* 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts
* 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json
* 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh
* 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh
* 14:25 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001
* 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json
* 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json
* 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s)
* 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable
* 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]]
* 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json
* 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s)
* 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment
* 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]]
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json
* 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json
* 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json
* 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json
* 13:13 moritzm: installing jaraco.context security updates
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json
* 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json
* 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json
* 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json
* 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to /var/cache/conftool/dbconfig/20260504-113620-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json
* 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie
* 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json
* 10:48 moritzm: installing bash updates from trixie point release
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json
* 10:42 moritzm: installing postgresql-17 security updates
* 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json
* 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance
* 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to /var/cache/conftool/dbconfig/20260504-100818-fceratto.json
* 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie
* 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json
* 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie
* 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json
* 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events
* 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events
* 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json
* 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1001.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json
* 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie
* 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s)
* 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync
* 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync
* 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts
* 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment
* 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie
* 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]]
* 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie
* 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie
* 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie
* 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic
* 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie
* 07:38 moritzm: installing Linux 6.12.85 on trixie hosts
* 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet
* 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie
* 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie
* 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie
* 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS trixie
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie
* 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie
* 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]]
* 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie
* 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]]
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie
* 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie
* 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie
* 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie
* 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie
* 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s)
* 11:57 samtar@deploy1003: samtar: Continuing with deployment
* 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]]
* 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s)
* 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie
* 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie
* 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie
* 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373
* 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372
* 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2367
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2359
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357
* 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie
* 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s)
* 20:02 krinkle@deploy1003: krinkle: Continuing with deployment
* 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]]
* 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s)
* 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]]
* 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts
* 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s)
* 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox
* 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie
* 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003
* 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie
* 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts
* 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s)
* 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
* 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts
* 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s)
* 14:57 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie
* 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 13:24 _Gerges: WikiMonitor setup
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s)
* 09:53 samtar@deploy1003: samtar: Continuing with deployment
* 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]]
* 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s)
* 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]]
* 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s)
* 00:13 zabe@deploy1003: zabe: Continuing with deployment
* 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2421402
2421401
2026-05-30T16:21:06Z
Stashbot
7414
arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] (duration: 06m 37s)
* 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]]
* 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s)
* 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]]
* 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]])
* 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s)
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]]
* 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet
* 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet
* 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet
* 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet
* 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet
* 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet
* 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet
* 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet
* 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet
* 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs
* 21:52 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2019.codfw.wmnet
* 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet
* 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet
* 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet
* 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet
* 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet
* 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp
* 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet
* 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet
* 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
* 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1022.eqiad.wmnet
* 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet
* 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet
* 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s)
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment
* 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet
* 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet
* 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI due to Java upgrade which causes integration/pipelinelib to not be loadable.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]]
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed
* 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:20 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
* 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet
* 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet
* 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet
* 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet
* 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048
* 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet
* 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet
* 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet
* 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet
* 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet
* 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet
* 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2008.codfw.wmnet
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed
* 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 slyngshede@dns1004: END - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 10:21 slyngshede@dns1004: START - running authdns-update
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet
* 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet
* 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet
* 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet
* 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot
* 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet
* 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T415165|T415165]]
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet
* 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet
* 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed
* 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet
* 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet
* 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
* 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 09:28 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet
* 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad
* 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]]
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org
* 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389
* 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947
* 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet
* 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947
* 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie
* 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet
* 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet
* 09:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org
* 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org
* 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed
* 08:26 moritzm: installing Java 11 security updates
* 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie
* 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm (forward port of latest Java 8 security release)
* 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot
* 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot
* 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot
* 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot
* 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot
* 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
* 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot
* 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s)
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 07:19 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]]
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s)
* 07:11 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 07:09 moritzm: remove haveged
* 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.*
* 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet
* 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.*
* 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s)
* 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
* 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:16 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036
* 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036
* 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder
* 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s)
* 20:51 sbassett@deploy1003: sbassett: Continuing with deployment
* 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]]
* 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s)
* 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 20:34 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]]
* 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s)
* 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment
* 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 tgr@deploy1003: cscott, tgr: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]]
* 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s)
* 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 19:53 otto@deploy1003: otto: Continuing with deployment
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d] (duration: 02m 00s)
* 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org
* 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 18:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia - [[phab:T401832|T401832]]
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:51 joal@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s)
* 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d]
* 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart
* 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart
* 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]]
* 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed
* 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 15:25 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83
* 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet
* 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet
* 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s)
* 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]]
* 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s)
* 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart
* 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]]
* 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s)
* 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
* 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]]
* 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s)
* 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
* 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). C
* 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s)
* 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment
* 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). C
* 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json
* 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json
* 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:34 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet
* 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 14:27 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie
* 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie
* 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2005.codfw.wmnet
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet
* 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet
* 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet
* 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet
* 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2003.codfw.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet
* 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
* 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 13:37 Lucas_WMDE: UTC afternoon backport+config window done
* 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main'.
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet
* 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
* 07:33 XioNoX: add gnmic 0.46.0 to reprepro
* 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s)
* 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover
* 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:14 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover
* 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]]
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
* 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json
* 06:54 moritzm: installing qemu security updates
* 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json
* 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]]
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json
* 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]]
* 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
* 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet
* 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json
* 06:19 fceratto@dns1005: END - running authdns-update
* 06:18 fceratto@dns1005: START - running authdns-update
* 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json
* 06:14 fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json
* 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]]
* 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json
* 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]]
* 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s)
* 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:58 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s)
* 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]]
* 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s)
* 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment
* 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s)
* 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment
* 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]]
* 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
* 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
* 21:16 mutante: gerrit-replica.wikimedia.org back online
* 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends
* 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:30 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:20 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:48 jhathaway@dns1004: END - running authdns-update
* 18:46 jhathaway@dns1004: START - running authdns-update
* 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet
* 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
* 18:26 herron: rebooting alert1002
* 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet
* 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet
* 18:16 mutante: releases.wikimedia.org - rebooting backends
* 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
* 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]]
* 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
* 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
* 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:46 herron: rebooting alert2002
* 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet
* 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
* 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet
* 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet
* 17:37 mutante: stewards* - rebooting
* 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
* 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet
* 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet
* 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet
* 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet
* 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:14 mutante: doc.wikimedia.org - rebooting backends
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet
* 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:13 topranks: restarted gnmic on netflow3004 because series were missing for cr2-esams
* 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
* 17:11 mutante: etherpad - rebooting backends
* 17:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet
* 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 17:04 mutante: contint2002, phab2002 - rebooting
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:37 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fixup data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]]
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to /var/cache/conftool/dbconfig/20260518-124030-fceratto.json
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json
* 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-[[phab:T423840|T423840]].patch`
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]]
* 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json
* 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]]
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
* 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
* 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye
* 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye
* 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
* 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 11:21 slyngshede@dns1004: END - running authdns-update
* 11:19 slyngshede@dns1004: START - running authdns-update
* 11:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org
* 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org
* 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org
* 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet
* 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:56 slyngshede@dns1004: END - running authdns-update
* 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye
* 10:54 slyngshede@dns1004: START - running authdns-update
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye
* 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
* 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
* 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json
* 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw
* 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json
* 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json
* 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json
* 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to /var/cache/conftool/dbconfig/20260518-100710-fceratto.json
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json
* 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]]
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json
* 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]]
* 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye
* 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]]
* 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:18 moritzm: installing Java 21 security updates
* 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]]
* 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts
* 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082
* 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s)
* 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 09:03 ayounsi@dns1004: END - running authdns-update
* 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s)
* 09:01 ayounsi@dns1004: START - running authdns-update
* 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet
* 08:50 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix updates from bookworm point release
* 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet
* 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json
* 06:59 moritzm: installing systemd bugfix updates from trixie point release
* 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts
* 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts
* 06:49 moritzm: installing glibc bugfix updates from trixie point release
* 06:44 moritzm: installing openssl bugfix updates from trixie point release
* 06:33 moritzm: installing Linux 6.12.88 on trixie hosts
* 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:11 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts
* 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet
* 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065
* 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065
* 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]]
* 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye
* 09:32 topranks: Migrate cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065
* 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]]
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064
* 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:03 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064
* 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json
* 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json
* 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json
* 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065
* 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064
* 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064
* 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
* 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)
* 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:47 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json
* 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json
* 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye
* 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to /var/cache/conftool/dbconfig/20260514-104521-marostegui.json
* 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye
* 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye
* 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned
* 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned
* 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned
* 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye
* 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye
* 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye
* 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye
* 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye
* 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json
* 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]])
* 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3
* 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie
* 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie
* 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]]
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s)
* 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment
* 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]]
* 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:20 cmooney@dns2005: END - running authdns-update
* 18:19 cmooney@dns2005: START - running authdns-update
* 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:42 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure
* 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul
* 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
* 15:36 fabfur: repooling cp7009 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
* 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]])
* 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:16 cmooney@dns2005: END - running authdns-update
* 15:15 cmooney@dns2005: START - running authdns-update
* 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie
* 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s)
* 14:37 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]]
* 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for uslfo network new swtiches - pt1979@cumin2002"
* 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for uslfo network new swtiches - pt1979@cumin2002"
* 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s)
* 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:16 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:15 jforrester@deploy1003: jforrester: Continuing with deployment
* 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]]
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:08 Lucas_WMDE: UTC afternoon backport+config window done
* 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}}
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}}
* 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie
* 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply
* 13:59 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}}
* 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s)
* 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]]
* {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}}
* 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment
* {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}}
* 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}}
* 13:25 moritzm: installing openjdk-11 security updates
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s)
* 13:07 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw
* 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]]
* 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s)
* 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]]
* 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
* 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]])
* 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]]
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]]
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie
* 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed
* 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie
* 11:40 topranks: delete old direct ibgp peering between cr1-drmrs and cr2-drmrs [[phab:T424611|T424611]]
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:27 topranks: add ibgp peering between cr1-drmrs and cr2-drmrs over loopback IPs [[phab:T424611|T424611]]
* 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts
* 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie
* 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie
* 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie
* 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie
* 10:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts
* 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]]
* 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie
* 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki2002.codfw.wmnet
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:10 moritzm: installing Apache security updates on Bullseye
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie
* 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye
* 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie
* 10:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json
* 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye
* 09:56 moritzm: installing distro-info-data updates from Bookworm point release
* 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]]
* 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]]
* 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye
* 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json
* 09:51 moritzm: installing ca-certificates update from Bookworm point release
* 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye
* 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s)
* 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:41 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]]
* 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:28 cmooney@dns2005: END - running authdns-update
* 09:27 cmooney@dns2005: START - running authdns-update
* 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet
* 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage
* 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye
* 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye
* 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 moritzm: installing dnsmasq security updates
* 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 08:38 cmooney@dns2005: END - running authdns-update
* 08:37 cmooney@dns2005: START - running authdns-update
* 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s)
* 08:20 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie
* 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay
* 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie
* 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie
* 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie
* 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie
* 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie
* 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:42 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm
* 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277
* 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm
* 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276
* 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm
* 01:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s)
* 01:28 zabe@deploy1003: zabe: Continuing with deployment
* 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]]
* 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:08 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] (duration: 12m 45s)
* 23:40 cscott@deploy1003: cscott: Continuing with deployment
* 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]]
* 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s)
* 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm
* 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update
* 21:57 dwisehaupt@dns1004: START - running authdns-update
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]]
* 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273
* 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm
* 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273
* 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* {{safesubst:SAL entry|1=12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42}}
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* {{safesubst:SAL entry|1=12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425}}
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s)
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie
* 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet
* 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]]
* 06:27 jayme@dns1004: END - running authdns-update
* 06:26 jayme@dns1004: START - running authdns-update
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] (duration: 36m 36s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s)
* 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]]
* 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s)
* 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm
* 19:58 zabe@deploy1003: zabe: Continuing with deployment
* 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]]
* 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts
* 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper
* 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1269
* 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:16 dzahn@dns1005: END - running authdns-update
* 19:14 dzahn@dns1005: START - running authdns-update
* 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva` to free up disk space
* 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]]
* 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json
* 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268
* 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json
* 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json
* 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:12 dancy@deploy1003: Installation of scap version "4.263.0" completed for 2 hosts
* 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s)
* 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json
* 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json
* 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s)
* 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:23 zabe@deploy1003: zabe: Continuing with deployment
* 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s)
* 15:42 zabe@deploy1003: zabe: Continuing with deployment
* 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]]
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017
* 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017
* 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 Lucas_WMDE: UTC afternoon backport+config window done
* 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18
* 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment
* {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}}
* 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]]
* {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}}
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment
* {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}}
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
* 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw
* 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}}
* 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]]
* 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s)
* 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
* 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie
* 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]]
* 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)
* 13:06 elukey: remove old discovery pki intermediate
* 13:03 otto@deploy1003: otto: Continuing with deployment
* 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]
* 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s)
* 12:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]]
* 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]])
* 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot
* 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts
* 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s)
* 11:21 jayme@deploy1003: Rolling back deployment
* 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]]
* 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]]
* 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts
* 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s)
* 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image
* 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:16 slyngs: Migrate of lvs2012 due to hardware issues
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s)
* 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]]
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]]
* 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]]
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:37 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json
* 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1230: [[phab:T419635|T419635]]
* 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json
* 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 08:10 slyngshede@dns1004: END - running authdns-update
* 08:08 slyngshede@dns1004: START - running authdns-update
* 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors
* 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS bookworm
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm
* 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-08 ==
* 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267
* 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm
* 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266
* 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm
* 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265
* 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/
* 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health
* 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps
* 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad
* 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart
* 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json
* 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json
* 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json
* 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie
* 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org
* 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie
* 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie
* 06:11 moritzm: installing postorius security updates
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie
* 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024
* 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023
* 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox
== 2026-05-07 ==
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s)
* 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19]
* 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s)
* 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19]
* 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s)
* 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19]
* 21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]
* 21:23 cscott@deploy1003: cscott: Continuing with deployment
* 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t
* 21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s)
* 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment
* 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
* 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]]
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki
* 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s)
* 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]]
* 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie
* 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022
* 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:26 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:06 cdanis@dns1005: END - running authdns-update
* 18:04 cdanis@dns1005: START - running authdns-update
* 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s)
* 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis
* 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 17:51 krinkle@deploy1003: krinkle: Continuing with deployment
* 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]]
* 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart
* 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 16:32 jynus: restarting backup1-* database primary hosts
* 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart
* 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart
* 16:14 sukhe@dns1004: END - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
* 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
* 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet
* 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs
* 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
* 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s)
* 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c]
* 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s)
* 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
* 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c]
* 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s)
* 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:32 slyngshede@dns1004: END - running authdns-update
* 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c]
* 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 14:30 slyngshede@dns1004: START - running authdns-update
* 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train
* 14:30 jmm@dns1004: END - running authdns-update
* 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 14:28 jmm@dns1004: START - running authdns-update
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox
* 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
* 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
* 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s)
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]]
* 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s)
* 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment
* 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]]
* 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 12:45 sukhe@dns1004: FAIL - running authdns-update
* 12:44 sukhe@dns1004: START - running authdns-update
* 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install5004.wikimedia.org with OS bookworm
* 12:23 slyngshede@dns1004: FAIL - running authdns-update
* 12:21 slyngshede@dns1004: START - running authdns-update
* 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:12 slyngshede@dns1004: FAIL - running authdns-update
* 12:11 slyngshede@dns1004: START - running authdns-update
* 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: after reimage to trixie
* 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie
* 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage
* 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie
* 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie
* 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json
* 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie
* 11:11 moritzm: installing modsecurity-apache security updates
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s)
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:54 root@cumin1003: START - Cookbook sre.dns.netbox
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json
* 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184
* 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie
* 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]]
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json
* 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage
* 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json
* 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie
* 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie
* 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie
* 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie
* 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie
* 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org
* 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:13 moritzm: rebalance Ganeti cluster in ulsfo following host reimages [[phab:T424686|T424686]]
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie
* 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org
* 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2182.codfw.wmnet with OS trixie
* 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie
* 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie
* 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie
* 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org
* 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS trixie
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie
* 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json
* 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage
* 08:23 XioNoX: drmrs remove old v6 gateway IP
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s)
* 07:49 dcausse@deploy1003: dcausse: Continuing with deployment
* 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]]
* 07:32 moritzm: installing apache2 security updates
* 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie
* 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie
* 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie
* 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie
* 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json
* 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json
* 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]]
* 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]]
* 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json
* 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s)
* 01:09 zabe@deploy1003: zabe: Continuing with deployment
* 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]]
* 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s)
* 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]]
* 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s)
* 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:10 cjming@deploy1003: cjming: Continuing with deployment
* 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]]
* 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s)
* 21:48 zabe@deploy1003: zabe: Continuing with deployment
* 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]]
* 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s)
* 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment
* 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]]
* 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie
* 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm
* 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:37 dzahn@dns1005: END - running authdns-update
* 18:35 dzahn@dns1005: START - running authdns-update
* 18:33 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1
* 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm
* 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work
* 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie
* 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie
* 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]]
* 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:45 jgreen@dns1004: END - running authdns-update
* 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s)
* 13:44 jgreen@dns1004: START - running authdns-update
* 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors
* 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors
* 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:31 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]]
* 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie
* 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie
* 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie
* 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie
* 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie
* 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie
* 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie
* 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s)
* 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie
* 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]]
* 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:50 moritzm: installing openjdk-17 security updates
* 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie
* 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie
* 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie
* 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json
* 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json
* 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json
* 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json
* 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json
* 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]]
* 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie
* 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:20 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet
* 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]]
* 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie
* 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie
* 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie
* 05:11 marostegui@dns1004: END - running authdns-update
* 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json
* 05:09 marostegui@dns1004: START - running authdns-update
* 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json
* 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json
* 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json
* 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s)
* 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]]
* 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s)
* 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment
* 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]]
* 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts
* 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s)
* 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s)
* 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment
* 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be ve
* 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]]
* 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s)
* 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie
* 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment
* 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002
* 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002
* 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox
* 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie
* 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts
* 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s)
* 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation
* 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s)
* 18:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]]
* 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s)
* 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw
* 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw
* 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad
* 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad
* 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json
* 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json
* 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json
* 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json
* 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie
* 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json
* 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie
* 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json
* 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json
* 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie
* 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie
* 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json
* 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json
* 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json
* 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json
* 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie
* 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json
* 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json
* 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 08:50 moritzm: installing augeas security updates
* 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json
* 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json
* 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json
* 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json
* 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json
* 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]]
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie
* 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json
* 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]]
* 08:05 ayounsi@dns1004: END - running authdns-update
* 08:03 ayounsi@dns1004: START - running authdns-update
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json
* 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie
* 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json
* 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json
* 07:55 awight: EU morning deployment was fun
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to /var/cache/conftool/dbconfig/20260505-075416-fceratto.json
* 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]]
* 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json
* 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]]
* 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]]
* 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]]
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie
* 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s)
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment
* 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]]
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie
* 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Reimage to Trixie
* 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie
* 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie
* 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 (duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]]
* 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s)
* 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment
* 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]]
* 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s)
* 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging
* 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s)
* 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]]
* 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s)
* 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment
* 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]]
* 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s)
* 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie
* 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment
* 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]]
* 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors
* 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting
* 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie
* 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s)
* 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]]
* 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s)
* 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]]
* 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s)
* 18:11 dancy@deploy1003: dancy: Rolling back deployment
* 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:09 dancy@deploy1003: Started scap sync-world: testing
* 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts
* 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s)
* 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s)
* 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]]
* 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json
* 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts
* 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts
* 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json
* 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh
* 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh
* 14:25 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001
* 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json
* 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json
* 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s)
* 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable
* 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]]
* 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json
* 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s)
* 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment
* 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]]
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json
* 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json
* 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json
* 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json
* 13:13 moritzm: installing jaraco.context security updates
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json
* 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json
* 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json
* 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json
* 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to /var/cache/conftool/dbconfig/20260504-113620-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json
* 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie
* 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json
* 10:48 moritzm: installing bash updates from trixie point release
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json
* 10:42 moritzm: installing postgresql-17 security updates
* 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json
* 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance
* 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to /var/cache/conftool/dbconfig/20260504-100818-fceratto.json
* 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie
* 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json
* 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie
* 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json
* 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events
* 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events
* 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json
* 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1001.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json
* 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie
* 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s)
* 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync
* 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync
* 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts
* 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment
* 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie
* 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]]
* 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie
* 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie
* 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie
* 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic
* 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie
* 07:38 moritzm: installing Linux 6.12.85 on trixie hosts
* 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet
* 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie
* 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie
* 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie
* 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS trixie
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie
* 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie
* 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]]
* 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie
* 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]]
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie
* 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie
* 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie
* 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie
* 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie
* 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s)
* 11:57 samtar@deploy1003: samtar: Continuing with deployment
* 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]]
* 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s)
* 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie
* 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie
* 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie
* 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373
* 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372
* 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2367
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2359
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357
* 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie
* 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s)
* 20:02 krinkle@deploy1003: krinkle: Continuing with deployment
* 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]]
* 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s)
* 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]]
* 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts
* 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s)
* 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox
* 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie
* 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003
* 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie
* 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts
* 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s)
* 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
* 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts
* 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s)
* 14:57 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie
* 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 13:24 _Gerges: WikiMonitor setup
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s)
* 09:53 samtar@deploy1003: samtar: Continuing with deployment
* 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]]
* 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s)
* 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]]
* 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s)
* 00:13 zabe@deploy1003: zabe: Continuing with deployment
* 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2421403
2421402
2026-05-30T16:21:09Z
Stashbot
7414
arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] (duration: 06m 37s)
* 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]]
* 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s)
* 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]]
* 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]])
* 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s)
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]]
* 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet
* 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet
* 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet
* 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet
* 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet
* 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet
* 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet
* 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet
* 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet
* 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs
* 21:52 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2019.codfw.wmnet
* 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet
* 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet
* 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet
* 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet
* 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet
* 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp
* 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet
* 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet
* 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
* 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1022.eqiad.wmnet
* 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet
* 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet
* 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s)
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment
* 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet
* 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet
* 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI due to Java upgrade which causes integration/pipelinelib to not be loadable.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]]
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed
* 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:20 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
* 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet
* 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet
* 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet
* 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet
* 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048
* 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet
* 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet
* 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet
* 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet
* 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet
* 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet
* 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2008.codfw.wmnet
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed
* 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 slyngshede@dns1004: END - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 10:21 slyngshede@dns1004: START - running authdns-update
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet
* 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet
* 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet
* 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet
* 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot
* 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet
* 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T415165|T415165]]
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet
* 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet
* 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed
* 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet
* 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet
* 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
* 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 09:28 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet
* 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad
* 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]]
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org
* 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389
* 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947
* 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet
* 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947
* 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie
* 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet
* 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet
* 09:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org
* 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org
* 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed
* 08:26 moritzm: installing Java 11 security updates
* 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie
* 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm (forward port of latest Java 8 security release)
* 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot
* 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot
* 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot
* 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot
* 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot
* 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
* 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot
* 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s)
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 07:19 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]]
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s)
* 07:11 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 07:09 moritzm: remove haveged
* 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.*
* 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet
* 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.*
* 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s)
* 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
* 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:16 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036
* 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036
* 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder
* 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s)
* 20:51 sbassett@deploy1003: sbassett: Continuing with deployment
* 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]]
* 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s)
* 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 20:34 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]]
* 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s)
* 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment
* 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 tgr@deploy1003: cscott, tgr: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]]
* 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s)
* 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 19:53 otto@deploy1003: otto: Continuing with deployment
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d] (duration: 02m 00s)
* 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org
* 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 18:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia-[[phab:T401832|T401832]]
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:51 joal@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s)
* 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d]
* 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart
* 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart
* 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]]
* 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed
* 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 15:25 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83
* 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet
* 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet
* 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s)
* 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]]
* 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s)
* 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart
* 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]]
* 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s)
* 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
* 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]]
* 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s)
* 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
* 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s)
* 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment
* 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json
* 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json
* 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:34 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet
* 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 14:27 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie
* 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie
* 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2005.codfw.wmnet
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet
* 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet
* 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet
* 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet
* 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2003.codfw.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet
* 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
* 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 13:37 Lucas_WMDE: UTC afternoon backport+config window done
* 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet
* 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
* 07:33 XioNoX: add gnmic 0.46.0 to reprepro
* 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s)
* 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover
* 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:14 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover
* 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]]
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
* 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json
* 06:54 moritzm: installing qemu security updates
* 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json
* 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]]
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json
* 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]]
* 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
* 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet
* 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json
* 06:19 fceratto@dns1005: END - running authdns-update
* 06:18 fceratto@dns1005: START - running authdns-update
* 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json
* 06:14 fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json
* 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]]
* 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json
* 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]]
* 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s)
* 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:58 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s)
* 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]]
* 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s)
* 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment
* 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s)
* 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment
* 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]]
* 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
* 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
* 21:16 mutante: gerrit-replica.wikimedia.org back online
* 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends
* 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:30 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:20 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:48 jhathaway@dns1004: END - running authdns-update
* 18:46 jhathaway@dns1004: START - running authdns-update
* 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet
* 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
* 18:26 herron: rebooting alert1002
* 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet
* 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet
* 18:16 mutante: releases.wikimedia.org - rebooting backends
* 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
* 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]]
* 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
* 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
* 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:46 herron: rebooting alert2002
* 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet
* 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
* 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet
* 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet
* 17:37 mutante: stewards* - rebooting
* 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
* 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet
* 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet
* 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet
* 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet
* 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:14 mutante: doc.wikimedia.org - rebooting backends
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet
* 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:13 topranks: restarted gnmic on netflow3004 because series were missing for cr2-esams
* 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
* 17:11 mutante: etherpad - rebooting backends
* 17:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet
* 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 17:04 mutante: contint2002, phab2002 - rebooting
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:37 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new Jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fix up data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]]
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to /var/cache/conftool/dbconfig/20260518-124030-fceratto.json
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json
* 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-[[phab:T423840|T423840]].patch`
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]]
* 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json
* 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]]
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
* 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
* 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye
* 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye
* 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
* 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 11:21 slyngshede@dns1004: END - running authdns-update
* 11:19 slyngshede@dns1004: START - running authdns-update
* 11:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org
* 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org
* 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org
* 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet
* 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:56 slyngshede@dns1004: END - running authdns-update
* 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye
* 10:54 slyngshede@dns1004: START - running authdns-update
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye
* 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
* 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
* 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json
* 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw
* 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json
* 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json
* 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json
* 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to /var/cache/conftool/dbconfig/20260518-100710-fceratto.json
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json
* 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]]
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json
* 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]]
* 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye
* 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]]
* 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:18 moritzm: installing Java 21 security updates
* 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]]
* 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts
* 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082
* 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s)
* 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 09:03 ayounsi@dns1004: END - running authdns-update
* 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s)
* 09:01 ayounsi@dns1004: START - running authdns-update
* 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet
* 08:50 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix updates from bookworm point release
* 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet
* 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json
* 06:59 moritzm: installing systemd bugfix updates from trixie point release
* 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts
* 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts
* 06:49 moritzm: installing glibc bugfix updates from trixie point release
* 06:44 moritzm: installing openssl bugfix updates from trixie point release
* 06:33 moritzm: installing Linux 6.12.88 on trixie hosts
* 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:11 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts
* 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet
* 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065
* 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065
* 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]]
* 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye
* 09:32 topranks: Migrate cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065
* 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]]
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064
* 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:03 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064
* 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json
* 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json
* 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json
* 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065
* 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064
* 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064
* 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
* 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)
* 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:47 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json
* 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json
* 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye
* 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to /var/cache/conftool/dbconfig/20260514-104521-marostegui.json
* 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye
* 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye
* 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned
* 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned
* 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned
* 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye
* 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye
* 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye
* 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye
* 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye
* 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json
* 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]])
* 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3
* 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie
* 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie
* 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]]
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s)
* 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment
* 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]]
* 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:20 cmooney@dns2005: END - running authdns-update
* 18:19 cmooney@dns2005: START - running authdns-update
* 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:42 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure
* 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul
* 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
* 15:36 fabfur: repooling cp7009 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
* 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]])
* 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:16 cmooney@dns2005: END - running authdns-update
* 15:15 cmooney@dns2005: START - running authdns-update
* 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie
* 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s)
* 14:37 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]]
* 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s)
* 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:16 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:15 jforrester@deploy1003: jforrester: Continuing with deployment
* 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]]
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:08 Lucas_WMDE: UTC afternoon backport+config window done
* 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}}
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}}
* 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie
* 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply
* 13:59 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}}
* 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s)
* 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]]
* {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}}
* 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment
* {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}}
* 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}}
* 13:25 moritzm: installing openjdk-11 security updates
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s)
* 13:07 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw
* 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]]
* 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s)
* 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]]
* 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
* 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]])
* 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]]
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]]
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie
* 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed
* 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie
* 11:40 topranks: delete old direct ibgp peering between cr1-drmrs and cr2-drmrs [[phab:T424611|T424611]]
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:27 topranks: add ibgp peering between cr1-drmrs and cr2-drmrs over loopback IPs [[phab:T424611|T424611]]
* 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts
* 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie
* 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie
* 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie
* 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie
* 10:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts
* 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]]
* 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie
* 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki2002.codfw.wmnet
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:10 moritzm: installing Apache security updates on Bullseye
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie
* 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye
* 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie
* 10:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json
* 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye
* 09:56 moritzm: installing distro-info-data updates from Bookworm point release
* 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]]
* 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]]
* 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye
* 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json
* 09:51 moritzm: installing ca-certificates update from Bookworm point release
* 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye
* 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s)
* 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:41 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]]
* 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:28 cmooney@dns2005: END - running authdns-update
* 09:27 cmooney@dns2005: START - running authdns-update
* 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet
* 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage
* 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye
* 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye
* 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 moritzm: installing dnsmasq security updates
* 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 08:38 cmooney@dns2005: END - running authdns-update
* 08:37 cmooney@dns2005: START - running authdns-update
* 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s)
* 08:20 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie
* 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay
* 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie
* 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie
* 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie
* 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie
* 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie
* 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:42 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm
* 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277
* 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm
* 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276
* 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm
* 01:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s)
* 01:28 zabe@deploy1003: zabe: Continuing with deployment
* 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]]
* 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:08 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] (duration: 12m 45s)
* 23:40 cscott@deploy1003: cscott: Continuing with deployment
* 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]]
* 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s)
* 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm
* 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update
* 21:57 dwisehaupt@dns1004: START - running authdns-update
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]]
* 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273
* 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm
* 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273
* 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42]]
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* 12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425]]
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s)
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie
* 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet
* 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]]
* 06:27 jayme@dns1004: END - running authdns-update
* 06:26 jayme@dns1004: START - running authdns-update
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] (duration: 36m 36s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s)
* 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]]
* 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s)
* 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm
* 19:58 zabe@deploy1003: zabe: Continuing with deployment
* 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]]
* 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts
* 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper
* 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1269
* 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:16 dzahn@dns1005: END - running authdns-update
* 19:14 dzahn@dns1005: START - running authdns-update
* 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva` to free up disk space
* 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]]
* 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json
* 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268
* 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json
* 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json
* 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:12 dancy@deploy1003: Installation of scap version "4.263.0" completed for 2 hosts
* 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s)
* 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json
* 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json
* 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s)
* 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:23 zabe@deploy1003: zabe: Continuing with deployment
* 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s)
* 15:42 zabe@deploy1003: zabe: Continuing with deployment
* 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]]
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017
* 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017
* 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 Lucas_WMDE: UTC afternoon backport+config window done
* 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18
* 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment
* {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}}
* 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]]
* {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}}
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment
* {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}}
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
* 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw
* 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}}
* 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]]
* 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s)
* 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
* 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie
* 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]]
* 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)
* 13:06 elukey: remove old discovery pki intermediate
* 13:03 otto@deploy1003: otto: Continuing with deployment
* 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]
* 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s)
* 12:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]]
* 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]])
* 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot
* 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts
* 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s)
* 11:21 jayme@deploy1003: Rolling back deployment
* 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]]
* 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]]
* 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts
* 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s)
* 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image
* 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:16 slyngs: Migrate off lvs2012 due to hardware issues
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s)
* 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]]
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]]
* 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]]
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:37 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json
* 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1230: [[phab:T419635|T419635]]
* 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json
* 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 08:10 slyngshede@dns1004: END - running authdns-update
* 08:08 slyngshede@dns1004: START - running authdns-update
* 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors
* 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS bookworm
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm
* 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-08 ==
* 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267
* 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm
* 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266
* 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm
* 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265
* 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/
* 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health
* 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps
* 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad
* 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart
* 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json
* 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json
* 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json
* 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie
* 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org
* 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie
* 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie
* 06:11 moritzm: installing postorius security updates
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie
* 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024
* 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023
* 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox
== 2026-05-07 ==
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s)
* 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19]
* 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s)
* 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19]
* 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s)
* 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19]
* {{safesubst:SAL entry|1=21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 21:23 cscott@deploy1003: cscott: Continuing with deployment
* 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t
* {{safesubst:SAL entry|1=21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s)
* 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment
* 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
* 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]]
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki
* 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s)
* 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]]
* 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie
* 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022
* 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:26 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:06 cdanis@dns1005: END - running authdns-update
* 18:04 cdanis@dns1005: START - running authdns-update
* 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s)
* 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis
* 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 17:51 krinkle@deploy1003: krinkle: Continuing with deployment
* 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]]
* 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart
* 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 16:32 jynus: restarting backup1-* database primary hosts
* 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart
* 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart
* 16:14 sukhe@dns1004: END - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
* 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
* 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet
* 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs
* 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
* 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s)
* 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c]
* 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s)
* 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
* 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c]
* 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s)
* 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:32 slyngshede@dns1004: END - running authdns-update
* 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c]
* 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 14:30 slyngshede@dns1004: START - running authdns-update
* 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train
* 14:30 jmm@dns1004: END - running authdns-update
* 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 14:28 jmm@dns1004: START - running authdns-update
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox
* 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
* 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
* 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s)
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]]
* 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s)
* 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment
* 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]]
* 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 12:45 sukhe@dns1004: FAIL - running authdns-update
* 12:44 sukhe@dns1004: START - running authdns-update
* 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install5004.wikimedia.org with OS bookworm
* 12:23 slyngshede@dns1004: FAIL - running authdns-update
* 12:21 slyngshede@dns1004: START - running authdns-update
* 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:12 slyngshede@dns1004: FAIL - running authdns-update
* 12:11 slyngshede@dns1004: START - running authdns-update
* 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: after reimage to trixie
* 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie
* 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage
* 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie
* 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie
* 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json
* 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie
* 11:11 moritzm: installing modsecurity-apache security updates
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s)
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:54 root@cumin1003: START - Cookbook sre.dns.netbox
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json
* 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184
* 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie
* 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]]
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json
* 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage
* 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json
* 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie
* 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie
* 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie
* 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie
* 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie
* 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org
* 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:13 moritzm: rebalance ganeti cluster in ulsfo following host reimages [[phab:T424686|T424686]]
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie
* 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org
* 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2182.codfw.wmnet with OS trixie
* 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie
* 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie
* 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie
* 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org
* 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS trixie
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie
* 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json
* 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage
* 08:23 XioNoX: drmrs remove old v6 gateway IP
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s)
* 07:49 dcausse@deploy1003: dcausse: Continuing with deployment
* 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]]
* 07:32 moritzm: installing apache2 security updates
* 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie
* 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie
* 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie
* 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie
* 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json
* 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json
* 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]]
* 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]]
* 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json
* 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s)
* 01:09 zabe@deploy1003: zabe: Continuing with deployment
* 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]]
* 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s)
* 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]]
* 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s)
* 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:10 cjming@deploy1003: cjming: Continuing with deployment
* 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]]
* 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s)
* 21:48 zabe@deploy1003: zabe: Continuing with deployment
* 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]]
* 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s)
* 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment
* 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]]
* 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie
* 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm
* 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:37 dzahn@dns1005: END - running authdns-update
* 18:35 dzahn@dns1005: START - running authdns-update
* 18:33 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1
* 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm
* 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work
* 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie
* 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie
* 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]]
* 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:45 jgreen@dns1004: END - running authdns-update
* 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s)
* 13:44 jgreen@dns1004: START - running authdns-update
* 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors
* 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors
* 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:31 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]]
* 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie
* 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie
* 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie
* 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie
* 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie
* 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie
* 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie
* 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s)
* 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie
* 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]]
* 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:50 moritzm: installing openjdk-17 security updates
* 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie
* 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie
* 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie
* 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json
* 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json
* 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json
* 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json
* 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json
* 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]]
* 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie
* 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:20 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet
* 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]]
* 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie
* 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie
* 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie
* 05:11 marostegui@dns1004: END - running authdns-update
* 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json
* 05:09 marostegui@dns1004: START - running authdns-update
* 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json
* 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json
* 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json
* 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s)
* 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]]
* 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s)
* 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment
* 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]]
* 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts
* 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s)
* 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s)
* 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment
* 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be ve
* 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]]
* 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s)
* 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie
* 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment
* 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002
* 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002
* 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox
* 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie
* 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts
* 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s)
* 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation
* 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s)
* 18:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]]
* 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s)
* 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw
* 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw
* 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad
* 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad
* 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json
* 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json
* 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json
* 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json
* 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie
* 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json
* 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie
* 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json
* 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json
* 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie
* 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie
* 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json
* 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json
* 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json
* 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json
* 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie
* 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json
* 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json
* 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 08:50 moritzm: installing augeas security updates
* 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json
* 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json
* 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json
* 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json
* 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json
* 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]]
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie
* 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json
* 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]]
* 08:05 ayounsi@dns1004: END - running authdns-update
* 08:03 ayounsi@dns1004: START - running authdns-update
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json
* 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie
* 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json
* 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json
* 07:55 awight: EU morning deployment was fun
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to /var/cache/conftool/dbconfig/20260505-075416-fceratto.json
* 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]]
* 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json
* 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]]
* 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]]
* 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]]
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie
* 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s)
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment
* 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]]
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie
* 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Reimage to Trixie
* 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie
* 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie
* 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 (duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]]
* 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s)
* 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment
* 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]]
* 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s)
* 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging
* 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s)
* 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]]
* 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s)
* 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment
* 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]]
* 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s)
* 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie
* 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment
* 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]]
* 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors
* 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting
* 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie
* 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s)
* 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]]
* 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s)
* 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]]
* 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s)
* 18:11 dancy@deploy1003: dancy: Rolling back deployment
* 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:09 dancy@deploy1003: Started scap sync-world: testing
* 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts
* 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s)
* 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s)
* 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]]
* 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json
* 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts
* 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts
* 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json
* 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh
* 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh
* 14:25 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001
* 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json
* 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json
* 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s)
* 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable
* 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]]
* 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json
* 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s)
* 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment
* 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]]
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json
* 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json
* 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json
* 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json
* 13:13 moritzm: installing jaraco.context security updates
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json
* 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json
* 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json
* 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json
* 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to /var/cache/conftool/dbconfig/20260504-113620-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json
* 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie
* 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json
* 10:48 moritzm: installing bash updates from trixie point release
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json
* 10:42 moritzm: installing postgresql-17 security updates
* 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json
* 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance
* 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to /var/cache/conftool/dbconfig/20260504-100818-fceratto.json
* 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie
* 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json
* 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie
* 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json
* 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events
* 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events
* 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json
* 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1001.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json
* 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie
* 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s)
* 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync
* 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync
* 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts
* 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment
* 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie
* 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]]
* 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie
* 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie
* 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie
* 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic
* 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie
* 07:38 moritzm: installing Linux 6.12.85 on trixie hosts
* 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet
* 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie
* 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie
* 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie
* 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS trixie
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie
* 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie
* 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]]
* 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie
* 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]]
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie
* 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie
* 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie
* 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie
* 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie
* 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s)
* 11:57 samtar@deploy1003: samtar: Continuing with deployment
* 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]]
* 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s)
* 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie
* 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie
* 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie
* 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373
* 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372
* 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2367
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2359
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357
* 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie
* 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s)
* 20:02 krinkle@deploy1003: krinkle: Continuing with deployment
* 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]]
* 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s)
* 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]]
* 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts
* 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s)
* 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox
* 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie
* 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003
* 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie
* 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts
* 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s)
* 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
* 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts
* 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s)
* 14:57 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie
* 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 13:24 _Gerges: WikiMonitor setup
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s)
* 09:53 samtar@deploy1003: samtar: Continuing with deployment
* 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]]
* 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s)
* 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]]
* 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s)
* 00:13 zabe@deploy1003: zabe: Continuing with deployment
* 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2421404
2421403
2026-05-30T16:21:13Z
Stashbot
7414
arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main'.
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] (duration: 06m 37s)
* 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]]
* 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s)
* 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]]
* 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]])
* 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s)
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]]
* 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet
* 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet
* 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet
* 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet
* 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet
* 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet
* 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet
* 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet
* 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet
* 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs
* 21:52 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2019.codfw.wmnet
* 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet
* 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet
* 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet
* 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet
* 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet
* 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp
* 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet
* 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet
* 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
* 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1022.eqiad.wmnet
* 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet
* 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet
* 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s)
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment
* 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet
* 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet
* 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI because a Java upgrade left integration/pipelinelib unloadable.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]]
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed
* 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:20 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
* 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet
* 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet
* 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet
* 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet
* 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048
* 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet
* 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet
* 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet
* 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet
* 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet
* 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet
* 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2008.codfw.wmnet
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed
* 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 slyngshede@dns1004: END - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 10:21 slyngshede@dns1004: START - running authdns-update
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet
* 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet
* 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet
* 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet
* 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot
* 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet
* 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T415165|T415165]]
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet
* 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet
* 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed
* 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet
* 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet
* 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
* 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 09:28 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet
* 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad
* 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]]
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org
* 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389
* 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947
* 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet
* 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947
* 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie
* 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet
* 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet
* 09:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org
* 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org
* 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed
* 08:26 moritzm: installing Java 11 security updates
* 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie
* 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm (forward port of latest Java 8 security release)
* 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot
* 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot
* 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot
* 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot
* 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot
* 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
* 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot
* 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s)
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 07:19 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]]
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s)
* 07:11 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 07:09 moritzm: remove haveged
* 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.*
* 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet
* 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.*
* 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s)
* 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
* 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:16 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036
* 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036
* 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder
* 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s)
* 20:51 sbassett@deploy1003: sbassett: Continuing with deployment
* 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]]
* 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s)
* 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 20:34 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]]
* 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s)
* 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment
* 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 tgr@deploy1003: cscott, tgr: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]]
* 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s)
* 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 19:53 otto@deploy1003: otto: Continuing with deployment
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d] (duration: 02m 00s)
* 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org
* 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 18:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia-[[phab:T401832|T401832]]
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:51 joal@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s)
* 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d]
* 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart
* 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart
* 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]]
* 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed
* 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 15:25 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83
* 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet
* 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet
* 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s)
* 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]]
* 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s)
* 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart
* 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]]
* 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s)
* 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
* 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]]
* 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s)
* 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
* 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s)
* 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment
* 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json
* 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json
* 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:34 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet
* 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 14:27 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie
* 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie
* 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2005.codfw.wmnet
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet
* 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet
* 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet
* 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet
* 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2003.codfw.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet
* 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
* 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 13:37 Lucas_WMDE: UTC afternoon backport+config window done
* 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet
* 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
* 07:33 XioNoX: add gnmic 0.46.0 to reprepro
* 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s)
* 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover
* 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:14 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover
* 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]]
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
* 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json
* 06:54 moritzm: installing qemu security updates
* 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json
* 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]]
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json
* 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]]
* 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
* 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet
* 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json
* 06:19 fceratto@dns1005: END - running authdns-update
* 06:18 fceratto@dns1005: START - running authdns-update
* 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json
* 06:14 fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json
* 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]]
* 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json
* 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]]
* 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s)
* 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:58 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s)
* 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]]
* 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s)
* 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment
* 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s)
* 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment
* 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]]
* 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
* 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
* 21:16 mutante: gerrit-replica.wikimedia.org back online
* 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends
* 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:30 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:20 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:48 jhathaway@dns1004: END - running authdns-update
* 18:46 jhathaway@dns1004: START - running authdns-update
* 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet
* 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
* 18:26 herron: rebooting alert1002
* 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet
* 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet
* 18:16 mutante: releases.wikimedia.org - rebooting backends
* 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
* 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]]
* 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
* 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
* 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:46 herron: rebooting alert2002
* 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet
* 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
* 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet
* 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet
* 17:37 mutante: stewards* - rebooting
* 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
* 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet
* 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet
* 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet
* 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet
* 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:14 mutante: doc.wikimedia.org - rebooting backends
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet
* 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:13 topranks: restarted gnmic on netflow3004 as series missing for cr2-esams
* 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
* 17:11 mutante: etherpad - rebooting backends
* 17:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet
* 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 17:04 mutante: contint2002, phab2002 - rebooting
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:37 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fixup data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]]
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to /var/cache/conftool/dbconfig/20260518-124030-fceratto.json
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json
* 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-T423840.patch` ([[phab:T423840|T423840]])
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]]
* 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json
* 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]]
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
* 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
* 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye
* 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye
* 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
* 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 11:21 slyngshede@dns1004: END - running authdns-update
* 11:19 slyngshede@dns1004: START - running authdns-update
* 11:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org
* 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org
* 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org
* 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet
* 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:56 slyngshede@dns1004: END - running authdns-update
* 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye
* 10:54 slyngshede@dns1004: START - running authdns-update
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye
* 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
* 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
* 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json
* 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw
* 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json
* 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json
* 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json
* 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to /var/cache/conftool/dbconfig/20260518-100710-fceratto.json
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json
* 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]]
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json
* 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]]
* 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye
* 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]]
* 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:18 moritzm: installing Java 21 security updates
* 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]]
* 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts
* 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082
* 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s)
* 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 09:03 ayounsi@dns1004: END - running authdns-update
* 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s)
* 09:01 ayounsi@dns1004: START - running authdns-update
* 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet
* 08:50 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix updates from bookworm point release
* 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet
* 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json
* 06:59 moritzm: installing systemd bugfix updates from trixie point release
* 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts
* 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts
* 06:49 moritzm: installing glibc bugfix updates from trixie point release
* 06:44 moritzm: installing openssl bugfix updates from trixie point release
* 06:33 moritzm: installing Linux 6.12.88 on trixie hosts
* 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:11 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts
* 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet
* 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065
* 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065
* 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]]
* 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye
* 09:32 topranks: Migrate cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065
* 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]]
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064
* 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:03 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064
* 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json
* 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json
* 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json
* 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065
* 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064
* 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064
* 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
* 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)
* 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:47 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: Disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: Disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: Disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json
* 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json
* 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye
* 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to /var/cache/conftool/dbconfig/20260514-104521-marostegui.json
* 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye
* 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye
* 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned
* 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned
* 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned
* 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye
* 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye
* 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye
* 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye
* 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye
* 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json
* 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]])
* 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3
* 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie
* 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie
* 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]]
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s)
* 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment
* 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]]
* 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:20 cmooney@dns2005: END - running authdns-update
* 18:19 cmooney@dns2005: START - running authdns-update
* 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:42 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure
* 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul
* 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
* 15:36 fabfur: repooling cp7009 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
* 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]])
* 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:16 cmooney@dns2005: END - running authdns-update
* 15:15 cmooney@dns2005: START - running authdns-update
* 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie
* 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s)
* 14:37 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]]
* 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s)
* 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:16 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:15 jforrester@deploy1003: jforrester: Continuing with deployment
* 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]]
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:08 Lucas_WMDE: UTC afternoon backport+config window done
* 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}}
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}}
* 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie
* 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply
* 13:59 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}}
* 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s)
* 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]]
* {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}}
* 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment
* {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}}
* 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}}
* 13:25 moritzm: installing openjdk-11 security updates
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s)
* 13:07 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw
* 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]]
* 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s)
* 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]]
* 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
* 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]])
* 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]]
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]]
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie
* 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed
* 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie
* 11:40 topranks: delete old direct ibgp peering between cr1-drmrs and cr2-drmrs [[phab:T424611|T424611]]
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:27 topranks: add ibgp peering between cr1-drmrs and cr2-drmrs over loopback IPs [[phab:T424611|T424611]]
* 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts
* 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie
* 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie
* 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie
* 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie
* 10:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts
* 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]]
* 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie
* 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki2002.codfw.wmnet
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:10 moritzm: installing Apache security updates on Bullseye
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie
* 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye
* 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie
* 10:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json
* 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye
* 09:56 moritzm: installing distro-info-data updates from Bookworm point release
* 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]]
* 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]]
* 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye
* 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json
* 09:51 moritzm: installing ca-certificates update from Bookworm point release
* 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye
* 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s)
* 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:41 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]]
* 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:28 cmooney@dns2005: END - running authdns-update
* 09:27 cmooney@dns2005: START - running authdns-update
* 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet
* 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage
* 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye
* 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye
* 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 moritzm: installing dnsmasq security updates
* 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 08:38 cmooney@dns2005: END - running authdns-update
* 08:37 cmooney@dns2005: START - running authdns-update
* 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s)
* 08:20 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie
* 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay
* 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie
* 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie
* 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie
* 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie
* 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie
* 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:42 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm
* 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277
* 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm
* 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276
* 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm
* 01:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s)
* 01:28 zabe@deploy1003: zabe: Continuing with deployment
* 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]]
* 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:08 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] (duration: 12m 45s)
* 23:40 cscott@deploy1003: cscott: Continuing with deployment
* 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]]
* 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s)
* 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm
* 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update
* 21:57 dwisehaupt@dns1004: START - running authdns-update
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]]
* 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273
* 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm
* 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273
* 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42]]
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* 12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425]]
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s)
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie
* 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet
* 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]]
* 06:27 jayme@dns1004: END - running authdns-update
* 06:26 jayme@dns1004: START - running authdns-update
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] (duration: 36m 36s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s)
* 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]]
* 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s)
* 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm
* 19:58 zabe@deploy1003: zabe: Continuing with deployment
* 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]]
* 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts
* 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper
* 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1269
* 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:16 dzahn@dns1005: END - running authdns-update
* 19:14 dzahn@dns1005: START - running authdns-update
* 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva`. to free up disk space
* 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]]
* 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json
* 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268
* 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json
* 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json
* 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:12 dancy@deploy1003: Installation of scap version "4.263.0" completed for 2 hosts
* 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s)
* 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json
* 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json
* 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s)
* 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:23 zabe@deploy1003: zabe: Continuing with deployment
* 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s)
* 15:42 zabe@deploy1003: zabe: Continuing with deployment
* 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]]
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017
* 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017
* 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 Lucas_WMDE: UTC afternoon backport+config window done
* 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18
* 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment
* {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}}
* 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]]
* {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}}
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment
* {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}}
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
* 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw
* 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}}
* 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]]
* 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s)
* 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
* 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie
* 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]]
* 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)
* 13:06 elukey: remove old discovery pki intermediate
* 13:03 otto@deploy1003: otto: Continuing with deployment
* 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]
* 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s)
* 12:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]]
* 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]])
* 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot
* 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts
* 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s)
* 11:21 jayme@deploy1003: Rolling back deployment
* 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]]
* 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]]
* 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts
* 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s)
* 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image
* 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:16 slyngs: Migrate of lvs2012 due to hardware issues
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s)
* 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]]
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]]
* 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]]
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:37 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json
* 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1230: [[phab:T419635|T419635]]
* 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json
* 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 08:10 slyngshede@dns1004: END - running authdns-update
* 08:08 slyngshede@dns1004: START - running authdns-update
* 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors
* 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS bookworm
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm
* 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-08 ==
* 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267
* 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm
* 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266
* 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm
* 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265
* 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/
* 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health
* 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps
* 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad
* 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart
* 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json
* 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json
* 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json
* 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie
* 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org
* 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie
* 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie
* 06:11 moritzm: installing postorius security updates
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie
* 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024
* 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023
* 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox
== 2026-05-07 ==
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s)
* 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19]
* 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s)
* 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19]
* 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s)
* 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19]
* {{safesubst:SAL entry|1=21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 21:23 cscott@deploy1003: cscott: Continuing with deployment
* 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t
* {{safesubst:SAL entry|1=21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s)
* 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment
* 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
* 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]]
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki
* 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s)
* 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]]
* 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie
* 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022
* 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:26 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:06 cdanis@dns1005: END - running authdns-update
* 18:04 cdanis@dns1005: START - running authdns-update
* 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s)
* 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis
* 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 17:51 krinkle@deploy1003: krinkle: Continuing with deployment
* 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]]
* 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart
* 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 16:32 jynus: restarting backup1-* database primary hosts
* 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart
* 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart
* 16:14 sukhe@dns1004: END - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
* 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
* 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet
* 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs
* 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
* 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s)
* 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c]
* 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s)
* 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
* 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c]
* 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s)
* 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:32 slyngshede@dns1004: END - running authdns-update
* 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c]
* 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 14:30 slyngshede@dns1004: START - running authdns-update
* 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train
* 14:30 jmm@dns1004: END - running authdns-update
* 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 14:28 jmm@dns1004: START - running authdns-update
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox
* 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
* 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
* 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s)
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]]
* 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s)
* 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment
* 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]]
* 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 12:45 sukhe@dns1004: FAIL - running authdns-update
* 12:44 sukhe@dns1004: START - running authdns-update
* 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install5004.wikimedia.org with OS bookworm
* 12:23 slyngshede@dns1004: FAIL - running authdns-update
* 12:21 slyngshede@dns1004: START - running authdns-update
* 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:12 slyngshede@dns1004: FAIL - running authdns-update
* 12:11 slyngshede@dns1004: START - running authdns-update
* 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: after reimage to trixie
* 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie
* 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage
* 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie
* 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie
* 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json
* 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie
* 11:11 moritzm: installing modsecurity-apache security updates
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s)
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:54 root@cumin1003: START - Cookbook sre.dns.netbox
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json
* 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184
* 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie
* 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]]
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json
* 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage
* 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json
* 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie
* 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie
* 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie
* 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie
* 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie
* 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org
* 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:13 moritzm: rebalance ganeti cluster in ulsfo following host reimages [[phab:T424686|T424686]]
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie
* 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org
* 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2182.codfw.wmnet with OS trixie
* 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie
* 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie
* 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie
* 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org
* 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS trixie
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie
* 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json
* 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage
* 08:23 XioNoX: drmrs remove old v6 gateway IP
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s)
* 07:49 dcausse@deploy1003: dcausse: Continuing with deployment
* 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]]
* 07:32 moritzm: installing apache2 security updates
* 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie
* 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie
* 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie
* 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie
* 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json
* 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json
* 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]]
* 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]]
* 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json
* 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s)
* 01:09 zabe@deploy1003: zabe: Continuing with deployment
* 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]]
* 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s)
* 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]]
* 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s)
* 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:10 cjming@deploy1003: cjming: Continuing with deployment
* 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]]
* 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s)
* 21:48 zabe@deploy1003: zabe: Continuing with deployment
* 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]]
* 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s)
* 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment
* 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]]
* 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie
* 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm
* 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:37 dzahn@dns1005: END - running authdns-update
* 18:35 dzahn@dns1005: START - running authdns-update
* 18:33 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1
* 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm
* 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work
* 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie
* 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie
* 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]]
* 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:45 jgreen@dns1004: END - running authdns-update
* 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s)
* 13:44 jgreen@dns1004: START - running authdns-update
* 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors
* 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors
* 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:31 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]]
* 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie
* 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie
* 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie
* 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie
* 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie
* 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie
* 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie
* 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s)
* 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie
* 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]]
* 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:50 moritzm: installing openjdk-17 security updates
* 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie
* 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie
* 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie
* 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json
* 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json
* 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json
* 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json
* 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json
* 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]]
* 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie
* 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:20 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet
* 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]]
* 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie
* 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie
* 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie
* 05:11 marostegui@dns1004: END - running authdns-update
* 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json
* 05:09 marostegui@dns1004: START - running authdns-update
* 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json
* 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json
* 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json
* 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s)
* 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]]
* 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s)
* 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment
* 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]]
* 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts
* 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s)
* 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s)
* 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment
* 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be ve
* 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]]
* 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s)
* 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie
* 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment
* 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002
* 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002
* 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox
* 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie
* 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts
* 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s)
* 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation
* 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s)
* 18:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]]
* 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s)
* 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw
* 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw
* 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad
* 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad
* 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json
* 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json
* 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json
* 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json
* 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie
* 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json
* 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie
* 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json
* 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json
* 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie
* 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie
* 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json
* 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json
* 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json
* 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json
* 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie
* 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json
* 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json
* 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 08:50 moritzm: installing augeas security updates
* 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json
* 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json
* 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json
* 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json
* 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json
* 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]]
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie
* 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json
* 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]]
* 08:05 ayounsi@dns1004: END - running authdns-update
* 08:03 ayounsi@dns1004: START - running authdns-update
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json
* 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie
* 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json
* 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json
* 07:55 awight: EU morning deployment was fun
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to /var/cache/conftool/dbconfig/20260505-075416-fceratto.json
* 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]]
* 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json
* 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]]
* 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]]
* 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]]
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie
* 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s)
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment
* 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]]
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie
* 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Reimage to Trixie
* 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie
* 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie
* 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 (duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]]
* 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s)
* 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment
* 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]]
* 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s)
* 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging
* 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s)
* 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]]
* 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s)
* 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment
* 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]]
* 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s)
* 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie
* 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment
* 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]]
* 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors
* 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting
* 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie
* 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s)
* 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]]
* 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s)
* 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]]
* 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s)
* 18:11 dancy@deploy1003: dancy: Rolling back deployment
* 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:09 dancy@deploy1003: Started scap sync-world: testing
* 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts
* 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s)
* 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s)
* 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]]
* 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json
* 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts
* 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts
* 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json
* 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh
* 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh
* 14:25 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001
* 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json
* 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json
* 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s)
* 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable
* 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]]
* 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json
* 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s)
* 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment
* 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]]
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json
* 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json
* 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json
* 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json
* 13:13 moritzm: installing jaraco.context security updates
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json
* 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json
* 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json
* 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json
* 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to /var/cache/conftool/dbconfig/20260504-113620-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json
* 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie
* 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json
* 10:48 moritzm: installing bash updates from trixie point release
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json
* 10:42 moritzm: installing postgresql-17 security updates
* 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json
* 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance
* 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to /var/cache/conftool/dbconfig/20260504-100818-fceratto.json
* 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie
* 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json
* 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie
* 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json
* 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events
* 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events
* 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json
* 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1001.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json
* 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie
* 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s)
* 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync
* 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync
* 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts
* 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment
* 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie
* 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]]
* 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie
* 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie
* 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie
* 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic
* 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie
* 07:38 moritzm: installing Linux 6.12.85 on trixie hosts
* 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet
* 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie
* 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie
* 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie
* 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS trixie
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie
* 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie
* 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]]
* 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie
* 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]]
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie
* 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie
* 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie
* 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie
* 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie
* 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s)
* 11:57 samtar@deploy1003: samtar: Continuing with deployment
* 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]]
* 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s)
* 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie
* 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie
* 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie
* 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373
* 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372
* 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2367
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2359
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357
* 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie
* 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s)
* 20:02 krinkle@deploy1003: krinkle: Continuing with deployment
* 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]]
* 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s)
* 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]]
* 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts
* 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s)
* 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox
* 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie
* 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003
* 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie
* 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts
* 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s)
* 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
* 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts
* 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s)
* 14:57 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie
* 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 13:24 _Gerges: WikiMonitor setup
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s)
* 09:53 samtar@deploy1003: samtar: Continuing with deployment
* 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]]
* 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s)
* 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]]
* 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s)
* 00:13 zabe@deploy1003: zabe: Continuing with deployment
* 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2421409
2421404
2026-05-31T02:00:26Z
Stashbot
7414
mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-31 ==
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: 💙cdanis@cp4047.ulsfo.wmnet ~ 🕦☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: 💙cdanis@cp5026.eqsin.wmnet ~ 🕚☕ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕚☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs 101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main'.
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] (duration: 06m 37s)
* 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]]
* 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s)
* 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]]
* 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]])
* 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s)
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]]
* 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet
* 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet
* 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet
* 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet
* 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet
* 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet
* 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet
* 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet
* 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet
* 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs
* 21:52 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2019.codfw.wmnet
* 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet
* 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet
* 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet
* 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet
* 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet
* 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp
* 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet
* 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet
* 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
* 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1022.eqiad.wmnet
* 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet
* 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet
* 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s)
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment
* 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet
* 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet
* 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI because a Java upgrade left integration/pipelinelib unable to load.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]]
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed
* 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:20 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
* 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet
* 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet
* 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet
* 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet
* 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048
* 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet
* 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet
* 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet
* 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet
* 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet
* 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet
* 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2008.codfw.wmnet
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed
* 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 slyngshede@dns1004: END - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 10:21 slyngshede@dns1004: START - running authdns-update
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet
* 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet
* 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet
* 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet
* 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot
* 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet
* 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T415165|T415165]]
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet
* 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet
* 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed
* 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet
* 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet
* 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
* 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 09:28 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet
* 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad
* 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]]
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org
* 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389
* 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947
* 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet
* 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947
* 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie
* 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet
* 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet
* 09:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org
* 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org
* 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed
* 08:26 moritzm: installing Java 11 security updates
* 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie
* 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm (forward port of latest Java 8 security release)
* 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot
* 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot
* 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot
* 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot
* 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot
* 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
* 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot
* 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s)
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 07:19 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]]
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s)
* 07:11 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 07:09 moritzm: remove haveged
* 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.*
* 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet
* 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.*
* 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s)
* 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
* 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:16 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036
* 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036
* 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:10 cdanis: cdanis@apt1002.wikimedia.org: sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder
* 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s)
* 20:51 sbassett@deploy1003: sbassett: Continuing with deployment
* 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]]
* 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s)
* 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 20:34 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]]
* 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s)
* 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment
* 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 tgr@deploy1003: cscott, tgr: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]]
* 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s)
* 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 19:53 otto@deploy1003: otto: Continuing with deployment
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d] (duration: 02m 00s)
* 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org
* 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 18:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia-[[phab:T401832|T401832]]
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:51 joal@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s)
* 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d]
* 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart
* 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart
* 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]]
* 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed
* 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 15:25 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83
* 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet
* 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet
* 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s)
* 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]]
* 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s)
* 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart
* 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]]
* 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s)
* 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
* 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]]
* 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s)
* 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
* 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s)
* 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment
* 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json
* 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json
* 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:34 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet
* 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 14:27 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie
* 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie
* 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2005.codfw.wmnet
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet
* 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet
* 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet
* 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet
* 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2003.codfw.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet
* 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
* 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 13:37 Lucas_WMDE: UTC afternoon backport+config window done
* 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main'.
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet
* 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
* 07:33 XioNoX: add gnmic 0.46.0 to reprepro
* 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s)
* 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover
* 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:14 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover
* 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]]
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
* 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json
* 06:54 moritzm: installing qemu security updates
* 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json
* 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]]
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json
* 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]]
* 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
* 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet
* 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json
* 06:19 fceratto@dns1005: END - running authdns-update
* 06:18 fceratto@dns1005: START - running authdns-update
* 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json
* 06:14 fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json
* 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]]
* 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json
* 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]]
* 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s)
* 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:58 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s)
* 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]]
* 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s)
* 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment
* 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s)
* 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment
* 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]]
* 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
* 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
* 21:16 mutante: gerrit-replica.wikimedia.org back online
* 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends
* 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:30 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:20 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:48 jhathaway@dns1004: END - running authdns-update
* 18:46 jhathaway@dns1004: START - running authdns-update
* 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet
* 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
* 18:26 herron: rebooting alert1002
* 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet
* 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet
* 18:16 mutante: releases.wikimedia.org - rebooting backends
* 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
* 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]]
* 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
* 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
* 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:46 herron: rebooting alert2002
* 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet
* 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
* 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet
* 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet
* 17:37 mutante: stewards* - rebooting
* 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
* 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet
* 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet
* 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet
* 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet
* 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:14 mutante: doc.wikimedia.org - rebooting backends
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet
* 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:13 topranks: restarted gnmic on netflow3004 as series missing for cr2-esams
* 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
* 17:11 mutante: etherpad - rebooting backends
* 17:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet
* 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 17:04 mutante: contint2002, phab2002 - rebooting
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:37 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fix up data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]]
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to /var/cache/conftool/dbconfig/20260518-124030-fceratto.json
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json
* 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-T423840.patch` ([[phab:T423840|T423840]])
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]]
* 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json
* 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]]
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
* 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
* 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye
* 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye
* 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
* 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 11:21 slyngshede@dns1004: END - running authdns-update
* 11:19 slyngshede@dns1004: START - running authdns-update
* 11:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org
* 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org
* 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org
* 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet
* 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:56 slyngshede@dns1004: END - running authdns-update
* 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye
* 10:54 slyngshede@dns1004: START - running authdns-update
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye
* 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
* 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
* 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json
* 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw
* 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json
* 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json
* 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json
* 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to /var/cache/conftool/dbconfig/20260518-100710-fceratto.json
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json
* 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]]
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json
* 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]]
* 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye
* 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]]
* 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:18 moritzm: installing Java 21 security updates
* 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]]
* 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts
* 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082
* 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s)
* 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 09:03 ayounsi@dns1004: END - running authdns-update
* 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s)
* 09:01 ayounsi@dns1004: START - running authdns-update
* 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet
* 08:50 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix updates from bookworm point release
* 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet
* 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json
* 06:59 moritzm: installing systemd bugfix updates from trixie point release
* 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts
* 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts
* 06:49 moritzm: installing glibc bugfix updates from trixie point release
* 06:44 moritzm: installing openssl bugfix updates from trixie point release
* 06:33 moritzm: installing Linux 6.12.88 on trixie hosts
* 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:11 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts
* 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet
* 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065
* 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065
* 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]]
* 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye
* 09:32 topranks: Migrate cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065
* 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]]
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064
* 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:03 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064
* 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json
* 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json
* 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json
* 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065
* 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064
* 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064
* 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
* 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)
* 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:47 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json
* 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json
* 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye
* 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to /var/cache/conftool/dbconfig/20260514-104521-marostegui.json
* 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye
* 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye
* 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned
* 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned
* 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned
* 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye
* 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye
* 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye
* 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye
* 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye
* 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json
* 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]])
* 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3
* 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie
* 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie
* 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]]
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s)
* 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment
* 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]]
* 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:20 cmooney@dns2005: END - running authdns-update
* 18:19 cmooney@dns2005: START - running authdns-update
* 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:42 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure
* 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul
* 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
* 15:36 fabfur: repooling cp7009 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
* 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]])
* 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:16 cmooney@dns2005: END - running authdns-update
* 15:15 cmooney@dns2005: START - running authdns-update
* 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie
* 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s)
* 14:37 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]]
* 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s)
* 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:16 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:15 jforrester@deploy1003: jforrester: Continuing with deployment
* 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]]
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:08 Lucas_WMDE: UTC afternoon backport+config window done
* 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}}
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}}
* 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie
* 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply
* 13:59 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}}
* 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s)
* 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]]
* {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}}
* 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment
* {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}}
* 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}}
* 13:25 moritzm: installing openjdk-11 security updates
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s)
* 13:07 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw
* 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]]
* 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s)
* 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]]
* 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
* 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]])
* 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]]
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]]
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie
* 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed
* 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie
* 11:40 topranks: delete old direct ibgp peering between cr1-drmrs and cr2-drmrs [[phab:T424611|T424611]]
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:27 topranks: add ibgp peering between cr1-drmrs and cr2-drmrs over loopback IPs [[phab:T424611|T424611]]
* 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts
* 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie
* 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie
* 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie
* 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie
* 10:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts
* 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]]
* 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie
* 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki2002.codfw.wmnet
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:10 moritzm: installing Apache security updates on Bullseye
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie
* 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye
* 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie
* 10:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json
* 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye
* 09:56 moritzm: installing distro-info-data updates from Bookworm point release
* 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]]
* 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]]
* 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye
* 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json
* 09:51 moritzm: installing ca-certificates update from Bookworm point release
* 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye
* 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s)
* 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:41 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]]
* 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:28 cmooney@dns2005: END - running authdns-update
* 09:27 cmooney@dns2005: START - running authdns-update
* 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet
* 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage
* 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye
* 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye
* 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 moritzm: installing dnsmasq security updates
* 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 08:38 cmooney@dns2005: END - running authdns-update
* 08:37 cmooney@dns2005: START - running authdns-update
* 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s)
* 08:20 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie
* 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay
* 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie
* 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie
* 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie
* 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie
* 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie
* 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:42 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm
* 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277
* 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm
* 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276
* 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm
* 01:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s)
* 01:28 zabe@deploy1003: zabe: Continuing with deployment
* 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]]
* 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:08 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] (duration: 12m 45s)
* 23:40 cscott@deploy1003: cscott: Continuing with deployment
* 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]]
* 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s)
* 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm
* 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update
* 21:57 dwisehaupt@dns1004: START - running authdns-update
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]]
* 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273
* 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm
* 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273
* 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* {{safesubst:SAL entry|1=12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42}}
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* {{safesubst:SAL entry|1=12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425}}
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s)
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie
* 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet
* 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]]
* 06:27 jayme@dns1004: END - running authdns-update
* 06:26 jayme@dns1004: START - running authdns-update
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] (duration: 36m 36s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s)
* 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]]
* 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s)
* 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm
* 19:58 zabe@deploy1003: zabe: Continuing with deployment
* 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]]
* 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts
* 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper
* 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1269
* 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:16 dzahn@dns1005: END - running authdns-update
* 19:14 dzahn@dns1005: START - running authdns-update
* 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva` to free up disk space
* 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]]
* 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json
* 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268
* 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json
* 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json
* 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:12 dancy@deploy1003: Installation of scap version "4.263.0" completed for 2 hosts
* 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s)
* 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json
* 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json
* 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s)
* 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:23 zabe@deploy1003: zabe: Continuing with deployment
* 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s)
* 15:42 zabe@deploy1003: zabe: Continuing with deployment
* 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]]
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017
* 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017
* 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 Lucas_WMDE: UTC afternoon backport+config window done
* 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18
* 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment
* {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}}
* 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]]
* {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}}
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment
* {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}}
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
* 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw
* 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}}
* 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]]
* 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s)
* 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
* 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie
* 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]]
* 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)
* 13:06 elukey: remove old discovery pki intermediate
* 13:03 otto@deploy1003: otto: Continuing with deployment
* 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]
* 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s)
* 12:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]]
* 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]])
* 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot
* 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts
* 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s)
* 11:21 jayme@deploy1003: Rolling back deployment
* 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]]
* 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]]
* 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts
* 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s)
* 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image
* 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:16 slyngs: Migrate of lvs2012 due to hardware issues
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s)
* 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]]
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]]
* 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]]
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:37 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json
* 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1230: [[phab:T419635|T419635]]
* 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json
* 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 08:10 slyngshede@dns1004: END - running authdns-update
* 08:08 slyngshede@dns1004: START - running authdns-update
* 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors
* 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS bookworm
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm
* 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-08 ==
* 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267
* 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm
* 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266
* 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm
* 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265
* 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/
* 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, error rate decreased significantly, p50 cut almost in half, but the service is still unstable, likely we'll need to identify more throttle-candidates to restore full health
* 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps
* 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad
* 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart
* 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json
* 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json
* 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json
* 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie
* 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org
* 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie
* 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie
* 06:11 moritzm: installing postorius security updates
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie
* 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024
* 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023
* 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox
== 2026-05-07 ==
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s)
* 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19]
* 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s)
* 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19]
* 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s)
* 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19]
* {{safesubst:SAL entry|1=21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 21:23 cscott@deploy1003: cscott: Continuing with deployment
* 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t
* {{safesubst:SAL entry|1=21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]}}
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s)
* 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment
* 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
* 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]]
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki
* 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s)
* 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]]
* 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie
* 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022
* 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:26 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:06 cdanis@dns1005: END - running authdns-update
* 18:04 cdanis@dns1005: START - running authdns-update
* 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s)
* 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis
* 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 17:51 krinkle@deploy1003: krinkle: Continuing with deployment
* 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]]
* 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart
* 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 16:32 jynus: restarting backup1-* database primary hosts
* 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart
* 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart
* 16:14 sukhe@dns1004: END - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
* 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
* 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet
* 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs
* 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
* 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s)
* 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c]
* 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s)
* 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
* 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c]
* 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s)
* 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:32 slyngshede@dns1004: END - running authdns-update
* 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c]
* 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 14:30 slyngshede@dns1004: START - running authdns-update
* 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train
* 14:30 jmm@dns1004: END - running authdns-update
* 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 14:28 jmm@dns1004: START - running authdns-update
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox
* 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
* 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
* 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s)
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]]
* 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s)
* 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment
* 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]]
* 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 12:45 sukhe@dns1004: FAIL - running authdns-update
* 12:44 sukhe@dns1004: START - running authdns-update
* 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install5004.wikimedia.org with OS bookworm
* 12:23 slyngshede@dns1004: FAIL - running authdns-update
* 12:21 slyngshede@dns1004: START - running authdns-update
* 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:12 slyngshede@dns1004: FAIL - running authdns-update
* 12:11 slyngshede@dns1004: START - running authdns-update
* 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: after reimage to trixie
* 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie
* 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage
* 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie
* 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie
* 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json
* 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie
* 11:11 moritzm: installing modsecurity-apache security updates
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s)
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:54 root@cumin1003: START - Cookbook sre.dns.netbox
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json
* 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184
* 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie
* 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]]
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json
* 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage
* 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json
* 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie
* 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie
* 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie
* 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie
* 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie
* 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org
* 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:13 moritzm: rebalance ganeti cluster in ulsfo following host reimages [[phab:T424686|T424686]]
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie
* 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org
* 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2182.codfw.wmnet with OS trixie
* 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie
* 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie
* 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie
* 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org
* 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS trixie
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie
* 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json
* 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage
* 08:23 XioNoX: drmrs remove old v6 gateway IP
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s)
* 07:49 dcausse@deploy1003: dcausse: Continuing with deployment
* 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]]
* 07:32 moritzm: installing apache2 security updates
* 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie
* 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie
* 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie
* 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie
* 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json
* 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json
* 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]]
* 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]]
* 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json
* 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s)
* 01:09 zabe@deploy1003: zabe: Continuing with deployment
* 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]]
* 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s)
* 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]]
* 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s)
* 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:10 cjming@deploy1003: cjming: Continuing with deployment
* 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]]
* 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s)
* 21:48 zabe@deploy1003: zabe: Continuing with deployment
* 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]]
* 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s)
* 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment
* 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]]
* 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie
* 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm
* 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:37 dzahn@dns1005: END - running authdns-update
* 18:35 dzahn@dns1005: START - running authdns-update
* 18:33 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1
* 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm
* 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work
* 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie
* 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie
* 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]]
* 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:45 jgreen@dns1004: END - running authdns-update
* 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s)
* 13:44 jgreen@dns1004: START - running authdns-update
* 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors
* 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors
* 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:31 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]]
* 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie
* 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie
* 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie
* 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie
* 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie
* 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie
* 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie
* 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s)
* 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie
* 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]]
* 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:50 moritzm: installing openjdk-17 security updates
* 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie
* 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie
* 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie
* 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json
* 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json
* 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json
* 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json
* 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json
* 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]]
* 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie
* 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:20 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet
* 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]]
* 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie
* 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie
* 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie
* 05:11 marostegui@dns1004: END - running authdns-update
* 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json
* 05:09 marostegui@dns1004: START - running authdns-update
* 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json
* 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json
* 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json
* 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s)
* 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]]
* 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s)
* 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment
* 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]]
* 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts
* 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s)
* 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s)
* 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment
* 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be ve
* 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]]
* 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s)
* 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie
* 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment
* 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002
* 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002
* 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox
* 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie
* 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts
* 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s)
* 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation
* 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s)
* 18:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]]
* 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s)
* 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw
* 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw
* 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad
* 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad
* 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json
* 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json
* 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json
* 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json
* 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie
* 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json
* 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie
* 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json
* 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json
* 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie
* 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie
* 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json
* 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json
* 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json
* 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json
* 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie
* 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json
* 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json
* 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 08:50 moritzm: installing augeas security updates
* 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json
* 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json
* 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json
* 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json
* 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json
* 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]]
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie
* 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json
* 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]]
* 08:05 ayounsi@dns1004: END - running authdns-update
* 08:03 ayounsi@dns1004: START - running authdns-update
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json
* 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie
* 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json
* 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json
* 07:55 awight: EU morning deployment was fun
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to /var/cache/conftool/dbconfig/20260505-075416-fceratto.json
* 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]]
* 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json
* 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]]
* 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]]
* 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]]
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie
* 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s)
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment
* 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]]
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie
* 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Reimage to Trixie
* 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie
* 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie
* 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 (duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns info for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns info for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]]
* 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s)
* 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment
* 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]]
* 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s)
* 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging
* 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s)
* 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]]
* 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s)
* 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment
* 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]]
* 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s)
* 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie
* 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment
* 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]]
* 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors
* 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting
* 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie
* 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s)
* 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]]
* 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s)
* 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]]
* 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s)
* 18:11 dancy@deploy1003: dancy: Rolling back deployment
* 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:09 dancy@deploy1003: Started scap sync-world: testing
* 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts
* 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s)
* 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s)
* 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]]
* 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json
* 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts
* 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts
* 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json
* 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh
* 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh
* 14:25 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001
* 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json
* 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json
* 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s)
* 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable
* 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]]
* 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json
* 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s)
* 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment
* 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]]
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json
* 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json
* 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json
* 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json
* 13:13 moritzm: installing jaraco.context security updates
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json
* 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json
* 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json
* 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json
* 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to /var/cache/conftool/dbconfig/20260504-113620-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json
* 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie
* 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json
* 10:48 moritzm: installing bash updates from trixie point release
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json
* 10:42 moritzm: installing postgresql-17 security updates
* 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json
* 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance
* 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to /var/cache/conftool/dbconfig/20260504-100818-fceratto.json
* 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie
* 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json
* 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie
* 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json
* 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events
* 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events
* 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json
* 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1001.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json
* 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie
* 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s)
* 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync
* 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync
* 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts
* 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment
* 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie
* 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]]
* 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie
* 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie
* 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie
* 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic
* 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie
* 07:38 moritzm: installing Linux 6.12.85 on trixie hosts
* 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet
* 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie
* 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie
* 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie
* 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS trixie
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie
* 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie
* 06:09 marostegui: Reimaging sanitarium master for s3; lag is expected on wikireplicas for s3 [[phab:T424792|T424792]]
* 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie
* 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]]
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie
* 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie
* 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie
* 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie
* 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie
* 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s)
* 11:57 samtar@deploy1003: samtar: Continuing with deployment
* 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]]
* 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s)
* 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie
* 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie
* 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie
* 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373
* 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372
* 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2367
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2359
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357
* 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie
* 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s)
* 20:02 krinkle@deploy1003: krinkle: Continuing with deployment
* 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]]
* 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s)
* 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]]
* 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts
* 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s)
* 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox
* 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie
* 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003
* 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie
* 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts
* 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s)
* 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
* 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts
* 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s)
* 14:57 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie
* 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 13:24 _Gerges: WikiMonitor setup
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s)
* 09:53 samtar@deploy1003: samtar: Continuing with deployment
* 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]]
* 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s)
* 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]]
* 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s)
* 00:13 zabe@deploy1003: zabe: Continuing with deployment
* 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2421410
2421409
2026-05-31T02:06:56Z
Stashbot
7414
mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
== 2026-05-31 ==
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 30s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-30 ==
* 16:21 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:21 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 06:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 06:38 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 27s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-29 ==
* 23:39 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 23:37 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 21:42 catrope@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 21:41 catrope@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] (duration: 06m 54s)
* 17:35 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 17:34 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:33 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1295487{{!}}Hide experiment if not active and no assigned group]]
* 16:30 jgreen@dns1004: END - running authdns-update
* 16:28 jgreen@dns1004: START - running authdns-update
* 16:13 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:12 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:28 dancy@deploy1003: Installation of scap version "4.267.0" completed for 2 hosts
* 15:26 dancy@deploy1003: Installing scap version "4.267.0" for 2 host(s)
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:15 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] (duration: 07m 58s)
* 14:11 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:09 kharlan@deploy1003: kharlan: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1295466{{!}}GlobalPreferencesHandler: Cast auto-reveal expiry to int (T427625)]]
* 13:53 moritzm: imported OpenJDK 21 21.0.11+10-1~deb12u1 to component/jdk21 (backport of latest Java 21 security release for Bookworm)
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1006.wikimedia.org
* 12:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1006.wikimedia.org with OS trixie
* 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1006.wikimedia.org with reason: host reimage
* 11:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1006.wikimedia.org with OS trixie
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:13 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1006.wikimedia.org on all recursors
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1006.wikimedia.org - jmm@cumin2002"
* 11:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1006.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1005.wikimedia.org
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader1005.wikimedia.org with OS trixie
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Pooling
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader1005.wikimedia.org with reason: host reimage
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader1005.wikimedia.org with OS trixie
* 10:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:55 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 09:50 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrade GitLab
* 09:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:44 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2014.codfw.wmnet with OS bookworm
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:20 jynus@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:12 jynus@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2014.codfw.wmnet with reason: host reimage
* 09:10 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 09:03 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad2002.codfw.wmnet
* 08:59 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad2002.codfw.wmnet
* 08:59 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad2002.codfw.wmnet - [[phab:T427588|T427588]]
* 08:54 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:51 jelto@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM etherpad1004.eqiad.wmnet
* 08:50 atsuko@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 08:50 jynus@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host backup2014.codfw.wmnet with OS bookworm
* 08:49 atsuko@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 08:47 jelto@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM etherpad1004.eqiad.wmnet
* 08:46 jelto: gnt-instance modify -B memory=4g,vcpus=1 etherpad1004.eqiad.wmnet - [[phab:T427588|T427588]]
* 08:42 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:39 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 08:39 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 08:38 atsuko@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 08:37 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 08:36 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 08:33 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 08:31 jynus@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host backup2014.codfw.wmnet with OS bookworm
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1005.wikimedia.org on all recursors
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1005.wikimedia.org - jmm@cumin2002"
* 08:18 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1005.wikimedia.org
* 08:05 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2212: Pooling
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Pooling
* 07:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2212.codfw.wmnet
* 07:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2212.codfw.wmnet
* 07:22 jynus@cumin2002: START - Cookbook sre.hosts.reimage for host backup2014.codfw.wmnet with OS bookworm
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2006.wikimedia.org
* 07:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2006.wikimedia.org with OS trixie
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2006.wikimedia.org with reason: host reimage
* 06:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2006.wikimedia.org with OS trixie
* 06:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2006.wikimedia.org on all recursors
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2006.wikimedia.org - jmm@cumin2002"
* 06:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2006.wikimedia.org
* 03:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 03:00 vriley@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts db1224.eqiad.wmnet
* 02:56 vriley@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts db1224.eqiad.wmnet
* 01:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 01:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 00:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 00:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp5032.eqsin.wmnet
* 00:23 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:22 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 00:21 amastilovic@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
== 2026-05-28 ==
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new ae1.522 interface - pt1979@cumin2002"
* 23:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:34 andrewbogott: reprepro includedeb trixie-wikimedia /home/andrew/magnum-cluster-api_0.36.6-1~wmf13u2_amd64.deb
* 22:31 logmsgbot: dreamyjazz Deployed security patch for [[phab:T426388|T426388]]
* 21:33 maryum: Deployed security fix for [[phab:T426867|T426867]]
* 21:21 alexsanford: Deployed security fix for [[phab:T426889|T426889]]
* 21:07 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cp5032.eqsin.wmnet
* 21:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 21:04 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "setup new eqsin vlan - pt1979@cumin2002 - [[phab:T427393|T427393]]"
* 20:48 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] (duration: 07m 34s)
* 20:44 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:43 arlolra@deploy1003: arlolra: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:41 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1295066{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T420336 T427098 T427354 T427082)]], [[gerrit:1295067{{!}}Bump wikimedia/parsoid to 0.24.0-a6 (T427082)]]
* 20:34 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] (duration: 07m 20s)
* 20:30 arlolra@deploy1003: arlolra: Continuing with deployment
* 20:29 arlolra@deploy1003: arlolra: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:27 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1293805{{!}}Deploy PRV to 7 wikis (T427331)]]
* 20:22 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] (duration: 09m 07s)
* 20:18 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Continuing with deployment
* 20:14 stran@deploy1003: alexsanford, stran, catrope, dreamyjazz: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]] synced to the testservers (see https://wikitech.
* 20:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5032.eqsin.wmnet with OS trixie
* 20:13 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1291996{{!}}Replace deprecated Hooks::getInstance (T426981)]], [[gerrit:1294393{{!}}Permissions: Create wmf-officeit group on officewiki]], [[gerrit:1294229{{!}}Deploy IRS Direct Reporting feature to enwiki (T427369)]], [[gerrit:1295039{{!}}Add 2FA enforcement demotion config for phase 2 groups (T423119)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1018.eqiad.wmnet
* 19:27 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1018.eqiad.wmnet
* 19:09 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: Kernel reboot
* 19:09 brett: Stopping pybal/puppet/downtiming lvs1018.eqiad.wmnet for reboot
* 19:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
* 18:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:51 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change cp5032 IP - pt1979@cumin2002"
* 18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:40 mutante: planet1003/planet2003 - apt-get upgrade - all pending package upgrades
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: Kernel reboot
* 18:34 brett: Stopping pybal/puppet/downtiming lvs1019.eqiad.wmnet for reboot and BIOS update/memory self-healing - [[phab:T426109|T426109]]
* 18:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 18:19 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Kernel reboot
* 18:19 brett: Stopping pybal/puppet/downtiming lvs2011.codfw.wmnet for reboot
* 18:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 18:06 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 18:00 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: Kernel reboot
* 17:57 brett: Stopping pybal/puppet/downtiming lvs2013.codfw.wmnet for reboot
* 17:19 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93393 and previous config saved to /var/cache/conftool/dbconfig/20260528-164514-fceratto.json
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93392 and previous config saved to /var/cache/conftool/dbconfig/20260528-163507-fceratto.json
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P93391 and previous config saved to /var/cache/conftool/dbconfig/20260528-162459-fceratto.json
* 16:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db1224.eqiad.wmnet with reason: unreachable [[phab:T427535|T427535]]
* 16:17 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:17 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:16 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:15 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93390 and previous config saved to /var/cache/conftool/dbconfig/20260528-161452-fceratto.json
* 16:14 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:14 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:09 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T426633|T426633]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260528-160646-fceratto.json
* 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93388 and previous config saved to /var/cache/conftool/dbconfig/20260528-160613-fceratto.json
* 15:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93387 and previous config saved to /var/cache/conftool/dbconfig/20260528-155605-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P93386 and previous config saved to /var/cache/conftool/dbconfig/20260528-154557-fceratto.json
* 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93385 and previous config saved to /var/cache/conftool/dbconfig/20260528-153550-fceratto.json
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93384 and previous config saved to /var/cache/conftool/dbconfig/20260528-152736-fceratto.json
* 15:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 15:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93383 and previous config saved to /var/cache/conftool/dbconfig/20260528-152708-fceratto.json
* 15:20 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp5032.eqsin.wmnet with reason: Testing reimaging on new subnet
* 15:18 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5032.*
* 15:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93382 and previous config saved to /var/cache/conftool/dbconfig/20260528-151701-fceratto.json
* 15:17 jhathaway: dmarc ingress test on mx-in1001
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P93381 and previous config saved to /var/cache/conftool/dbconfig/20260528-150653-fceratto.json
* 14:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93380 and previous config saved to /var/cache/conftool/dbconfig/20260528-145646-fceratto.json
* 14:56 moritzm: installing nginx security updates
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93379 and previous config saved to /var/cache/conftool/dbconfig/20260528-144936-fceratto.json
* 14:49 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93378 and previous config saved to /var/cache/conftool/dbconfig/20260528-144909-fceratto.json
* 14:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader2005.wikimedia.org
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host urldownloader2005.wikimedia.org with OS trixie
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:39 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2189.codfw.wmnet
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93377 and previous config saved to /var/cache/conftool/dbconfig/20260528-143901-fceratto.json
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P93376 and previous config saved to /var/cache/conftool/dbconfig/20260528-142854-fceratto.json
* 14:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on urldownloader2005.wikimedia.org with reason: host reimage
* 14:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] (duration: 11m 29s)
* 14:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93375 and previous config saved to /var/cache/conftool/dbconfig/20260528-141846-fceratto.json
* 14:15 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93374 and previous config saved to /var/cache/conftool/dbconfig/20260528-141029-fceratto.json
* 14:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 14:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host urldownloader2005.wikimedia.org with OS trixie
* 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93373 and previous config saved to /var/cache/conftool/dbconfig/20260528-141001-fceratto.json
* 14:09 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294998{{!}}ImageContentLookup: Fix issue created by strict types (T427505)]], [[gerrit:1295001{{!}}Enable hCaptcha for VisualEditor in group 1 (T425940)]]
* 14:00 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93371 and previous config saved to /var/cache/conftool/dbconfig/20260528-135951-fceratto.json
* 13:58 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=cp6015.drmrs.wmnet,service=(cdn{{!}}ats-be)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader2005.wikimedia.org on all recursors
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader2005.wikimedia.org - jmm@cumin2002"
* 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P93370 and previous config saved to /var/cache/conftool/dbconfig/20260528-134944-fceratto.json
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93369 and previous config saved to /var/cache/conftool/dbconfig/20260528-133936-fceratto.json
* 13:39 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:38 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:36 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] (duration: 06m 40s)
* 13:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93368 and previous config saved to /var/cache/conftool/dbconfig/20260528-133230-fceratto.json
* 13:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 13:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93367 and previous config saved to /var/cache/conftool/dbconfig/20260528-133202-fceratto.json
* 13:31 mlitn@deploy1003: mlitn: Continuing with deployment
* 13:31 mlitn@deploy1003: mlitn: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:29 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1294986{{!}}Image Carousel: check candidate pages (T427336)]]
* 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93366 and previous config saved to /var/cache/conftool/dbconfig/20260528-132155-fceratto.json
* 13:21 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 13:17 elukey: clean up a lot of stale Kafka ACLs on Kafka Jumbo - Details in [[phab:T425528|T425528]]
* 13:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader2005.wikimedia.org
* 13:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P93365 and previous config saved to /var/cache/conftool/dbconfig/20260528-131147-fceratto.json
* 13:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93364 and previous config saved to /var/cache/conftool/dbconfig/20260528-130139-fceratto.json
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93363 and previous config saved to /var/cache/conftool/dbconfig/20260528-125439-fceratto.json
* 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 12:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93362 and previous config saved to /var/cache/conftool/dbconfig/20260528-125412-fceratto.json
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93361 and previous config saved to /var/cache/conftool/dbconfig/20260528-124404-fceratto.json
* 12:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:43 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:39 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:38 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P93360 and previous config saved to /var/cache/conftool/dbconfig/20260528-123357-fceratto.json
* 12:25 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93359 and previous config saved to /var/cache/conftool/dbconfig/20260528-122349-fceratto.json
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93358 and previous config saved to /var/cache/conftool/dbconfig/20260528-121551-fceratto.json
* 12:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 12:15 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 12:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93357 and previous config saved to /var/cache/conftool/dbconfig/20260528-121523-fceratto.json
* 12:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93356 and previous config saved to /var/cache/conftool/dbconfig/20260528-120515-fceratto.json
* 12:02 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1006.eqiad.wmnet with OS trixie
* 12:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 12:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 12:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P93355 and previous config saved to /var/cache/conftool/dbconfig/20260528-115508-fceratto.json
* 11:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93354 and previous config saved to /var/cache/conftool/dbconfig/20260528-114500-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93353 and previous config saved to /var/cache/conftool/dbconfig/20260528-113635-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93352 and previous config saved to /var/cache/conftool/dbconfig/20260528-113559-fceratto.json
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93351 and previous config saved to /var/cache/conftool/dbconfig/20260528-112551-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P93350 and previous config saved to /var/cache/conftool/dbconfig/20260528-111543-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93349 and previous config saved to /var/cache/conftool/dbconfig/20260528-110536-fceratto.json
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93348 and previous config saved to /var/cache/conftool/dbconfig/20260528-105820-fceratto.json
* 10:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1006.eqiad.wmnet with OS trixie
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93347 and previous config saved to /var/cache/conftool/dbconfig/20260528-105753-fceratto.json
* 10:56 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:55 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:50 moritzm: update trixie netboot image for 13.5 point release [[phab:T427072|T427072]]
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93346 and previous config saved to /var/cache/conftool/dbconfig/20260528-104745-fceratto.json
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P93345 and previous config saved to /var/cache/conftool/dbconfig/20260528-103738-fceratto.json
* 10:29 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P13724 # [[phab:T406971|T406971]]
* 10:28 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P14223 # [[phab:T422264|T422264]]
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93344 and previous config saved to /var/cache/conftool/dbconfig/20260528-102730-fceratto.json
* 10:26 arthurtaylor@deploy1003: mwscript-k8s job started: extensions/Wikibase/repo/maintenance/changePropertyDataType.php --wiki wikidatawiki --new-data-type external-id --property-id P1748 # [[phab:T422392|T422392]]
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93343 and previous config saved to /var/cache/conftool/dbconfig/20260528-101900-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93342 and previous config saved to /var/cache/conftool/dbconfig/20260528-101829-fceratto.json
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93341 and previous config saved to /var/cache/conftool/dbconfig/20260528-100822-fceratto.json
* 09:59 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] (duration: 06m 41s)
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P93340 and previous config saved to /var/cache/conftool/dbconfig/20260528-095814-fceratto.json
* 09:55 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 09:54 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:52 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1290687{{!}}stream: webrequest.page_view (T426092 T426091)]]
* 09:48 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] (duration: 07m 37s)
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93339 and previous config saved to /var/cache/conftool/dbconfig/20260528-094807-fceratto.json
* 09:44 dreamyjazz@deploy1003: dreamyjazz, stran: Continuing with deployment
* 09:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:42 dreamyjazz@deploy1003: dreamyjazz, stran: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294243{{!}}Set minimum edit count for skipcaptcha right to 10 (T426973)]], [[gerrit:1294937{{!}}CheckUserLookupUtils: Fix error introduced by strict types (T427480)]]
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93338 and previous config saved to /var/cache/conftool/dbconfig/20260528-093920-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93337 and previous config saved to /var/cache/conftool/dbconfig/20260528-093849-fceratto.json
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93336 and previous config saved to /var/cache/conftool/dbconfig/20260528-092842-fceratto.json
* 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93335 and previous config saved to /var/cache/conftool/dbconfig/20260528-092239-fceratto.json
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki-root1001.eqiad.wmnet
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:22 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:22 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki-root1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 09:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P93334 and previous config saved to /var/cache/conftool/dbconfig/20260528-091834-fceratto.json
* 09:18 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:18 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: Reboot completed
* 09:17 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:17 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:13 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:13 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki-root1001.eqiad.wmnet
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93332 and previous config saved to /var/cache/conftool/dbconfig/20260528-091231-fceratto.json
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93331 and previous config saved to /var/cache/conftool/dbconfig/20260528-090826-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P93329 and previous config saved to /var/cache/conftool/dbconfig/20260528-090224-fceratto.json
* 09:02 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod (duration: 02m 31s)
* 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93328 and previous config saved to /var/cache/conftool/dbconfig/20260528-090114-fceratto.json
* 09:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 09:00 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a] (duration: 02m 08s)
* 08:59 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Deploying to prod
* 08:58 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (thin): Regular analytics weekly train THIN - 2[analytics/refinery@878cb24a]
* 08:57 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host (duration: 00m 53s)
* 08:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@6200ab1] (releasing): [[phab:T427406|T427406]] Testing on backup host
* 08:56 joal@deploy1003: Finished deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a] (duration: 06m 54s)
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93327 and previous config saved to /var/cache/conftool/dbconfig/20260528-085216-fceratto.json
* 08:50 XioNoX: cr1-codfw# delete protocols bgp group fundraising family inet6 - [[phab:T423384|T423384]]
* 08:49 joal@deploy1003: Started deploy [analytics/refinery@878cb24]: Regular analytics weekly train - 2 [analytics/refinery@878cb24a]
* 08:49 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] (duration: 09m 20s)
* 08:49 joal@deploy1003: Finished deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a] (duration: 02m 00s)
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P93326 and previous config saved to /var/cache/conftool/dbconfig/20260528-084906-fceratto.json
* 08:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 08:48 slyngshede@dns1004: END - running authdns-update
* 08:47 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1165: Reboot completed
* 08:47 joal@deploy1003: Started deploy [analytics/refinery@878cb24] (hadoop-test): Regular analytics weekly train TEST -2 [analytics/refinery@878cb24a]
* 08:47 slyngs: Upgrade IDP to CAS 7.3.7.1
* 08:46 slyngshede@dns1004: START - running authdns-update
* 08:45 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93324 and previous config saved to /var/cache/conftool/dbconfig/20260528-084149-fceratto.json
* 08:41 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:40 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1294925{{!}}hCaptcha: Regenerate VisualEditor captcha token per save attempt (T427334)]]
* 08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 08:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93323 and previous config saved to /var/cache/conftool/dbconfig/20260528-083504-fceratto.json
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 08:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93322 and previous config saved to /var/cache/conftool/dbconfig/20260528-083331-fceratto.json
* 08:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1209: Test
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93320 and previous config saved to /var/cache/conftool/dbconfig/20260528-082324-fceratto.json
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: repool after crash
* 08:17 slyngshede@dns1004: END - running authdns-update
* 08:16 slyngshede@dns1004: START - running authdns-update
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P93318 and previous config saved to /var/cache/conftool/dbconfig/20260528-081316-fceratto.json
* 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:09 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1209: Test
* 08:05 hashar@deploy1003: Finished deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016 (duration: 00m 13s)
* 08:05 hashar@deploy1003: Started deploy [integration/docroot@2a51016]: build: update dependencies + eslint fix in comment. f021d3f..2a51016
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93315 and previous config saved to /var/cache/conftool/dbconfig/20260528-080309-fceratto.json
* 07:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93314 and previous config saved to /var/cache/conftool/dbconfig/20260528-075631-fceratto.json
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020,1022-1023].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
* 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93313 and previous config saved to /var/cache/conftool/dbconfig/20260528-075521-fceratto.json
* 07:47 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93311 and previous config saved to /var/cache/conftool/dbconfig/20260528-074513-fceratto.json
* 07:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2189: repool after crash
* 07:36 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrade GitLab replica
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93309 and previous config saved to /var/cache/conftool/dbconfig/20260528-073506-fceratto.json
* 07:34 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:29 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] (duration: 06m 29s)
* 07:25 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Continuing with deployment
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93308 and previous config saved to /var/cache/conftool/dbconfig/20260528-072458-fceratto.json
* 07:24 wmde-fisch@deploy1003: thiemowmde, wmde-fisch: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:24 jelto@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrade GitLab replica
* 07:23 tgr@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=enwikisource --logwiki=metawiki Ioed Renamed_user_4232d41570b9e8f46ef150e5e360e446 # [[phab:T427459|T427459]]
* 07:22 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1294808{{!}}Don't run the click intent experiment on mobile (T426743)]]
* 07:20 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] (duration: 06m 54s)
* 07:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93307 and previous config saved to /var/cache/conftool/dbconfig/20260528-071836-fceratto.json
* 07:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 07:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1167: Reboot completed
* 07:16 wmde-fisch@deploy1003: wmde-fisch, robertsky: Continuing with deployment
* 07:15 wmde-fisch@deploy1003: wmde-fisch, robertsky: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1270986{{!}}Update wikimania wordmark for 2026 (T413331)]]
* 07:11 wmde-fisch@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] (duration: 07m 15s)
* 07:07 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Continuing with deployment
* 07:06 wmde-fisch@deploy1003: wmde-fisch, arthurtaylor: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 wmde-fisch@deploy1003: Started scap sync-world: Backport for [[gerrit:1289898{{!}}Disable support for PHP-serialized EntityData on Wikidata production (T98035)]]
* 06:43 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1167: Reboot completed
* 06:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93303 and previous config saved to /var/cache/conftool/dbconfig/20260528-064217-fceratto.json
* 06:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93302 and previous config saved to /var/cache/conftool/dbconfig/20260528-063357-fceratto.json
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:25 hashar: Restarting CI Jenkins for plugins upgrades
* 06:16 fceratto@dns1005: END - running authdns-update
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1209 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93301 and previous config saved to /var/cache/conftool/dbconfig/20260528-061609-fceratto.json
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1193 to s8 primary and set section read-write [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93300 and previous config saved to /var/cache/conftool/dbconfig/20260528-061138-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93299 and previous config saved to /var/cache/conftool/dbconfig/20260528-061048-fceratto.json
* 06:10 federico3: Starting s8 eqiad failover from db1209 to db1193 - [[phab:T426095|T426095]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1193 with weight 0 [[phab:T426095|T426095]]', diff saved to https://phabricator.wikimedia.org/P93298 and previous config saved to /var/cache/conftool/dbconfig/20260528-060412-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s8 [[phab:T426095|T426095]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:53 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:53 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new subnet in eqsin - pt1979@cumin2002"
* 00:49 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:25 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] (duration: 07m 12s)
* 00:21 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294470{{!}}Activate conductwiki (T426984)]]
* 00:12 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] (duration: 07m 25s)
* 00:09 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 00:08 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:06 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:04 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1294438{{!}}Init conductwiki (T426984)]]
* 00:04 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
== 2026-05-27 ==
* 23:13 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] (duration: 08m 42s)
* 23:09 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Continuing with deployment
* 23:06 jdlrobson@deploy1003: jdlrobson, h2o, egardner: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:04 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294432{{!}}Exclude more content from selection (T426308)]], [[gerrit:1285523{{!}}Remove MinervaNightMode config after skin cleanup (T426689)]]
* 22:58 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] (duration: 07m 49s)
* 22:55 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.sanitarium_restart (exit_code=0)
* 22:54 catrope@deploy1003: catrope: Continuing with deployment
* 22:52 catrope@deploy1003: catrope: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:50 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294435{{!}}passwordlessLogin: Limit conditional mediation to the main login form (T427419)]]
* 22:46 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] (duration: 06m 54s)
* 22:42 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:41 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:40 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitarium_restart (exit_code=99)
* 22:40 ladsgroup@cumin1003: START - Cookbook sre.mysql.sanitarium_restart
* 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294360{{!}}Thumbnails are not being optimized in large mode (T427237)]], [[gerrit:1294322{{!}}Thumbnails are not being optimized in large mode (T427237)]]
* 22:39 ladsgroup@deploy1003: Finished scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) (duration: 07m 16s)
* 22:35 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:34 ladsgroup@deploy1003: ladsgroup: Add conduct.wikimedia.org ([[phab:T426984|T426984]]) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:33 ladsgroup@deploy1003: Started scap sync-world: Add conduct.wikimedia.org ([[phab:T426984|T426984]])
* 22:13 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] (duration: 10m 00s)
* 22:09 egardner@deploy1003: egardner: Continuing with deployment
* 22:05 egardner@deploy1003: egardner: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:03 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1294370{{!}}Carousel only on articles (T427336)]]
* 21:37 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on relforge[1008-1010].eqiad.wmnet with reason: non-production environment
* 21:20 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 21:20 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 21:19 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 21:04 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] (duration: 07m 38s)
* 20:59 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Continuing with deployment
* 20:58 ebernhardson@deploy1003: matmarex, ebernhardson, pppery: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:56 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288370{{!}}Allow Vector 2022 font size changes in namespace 100 for enwiktionary (T423766)]], [[gerrit:1293819{{!}}Fix case of 'commonsfinder' in $wgUrlProtocols (T426614)]]
* 20:51 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 30s)
* 20:47 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:46 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:44 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294373{{!}}identity: Prune private ips from x-forwarded-for (T407432)]], [[gerrit:1294374{{!}}Revert^2 "cirrus: AB test query suggester variants" (T407432)]]
* 20:43 swfrench-wmf: reprepro include dh-php_5.5+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:39 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts lvs1016.eqiad.wmnet
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:38 swfrench-wmf: reprepro include php-defaults_94+wmf12u1 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:37 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: lvs1016.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:31 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:27 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf12u2 into component/php83 for bookworm-wikimedia - [[phab:T427312|T427312]]
* 20:25 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts lvs1016.eqiad.wmnet
* 20:25 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] (duration: 08m 11s)
* 20:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1016.eqiad.wmnet with OS bullseye
* 20:21 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 20:19 sbisson@deploy1003: sbisson: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1294342{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294343{{!}}Allow disabling experiment for experienced editors (>=100 edits) (T426871)]], [[gerrit:1294344{{!}}frwiki: restrict Article Guidance experiment to junior editors (T426871)]]
* 20:14 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12355
* 20:04 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12355
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
* 19:48 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:48 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:46 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:45 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5032.eqsin.wmnet
* 19:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 19:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 19:01 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f] (duration: 02m 08s)
* 18:59 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (thin): Regular analytics weekly train THIN [analytics/refinery@96cf761f]
* 18:58 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 05m 01s)
* 18:53 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:53 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] (duration: 07m 41s)
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5031.eqsin.wmnet
* 18:49 catrope@deploy1003: catrope: Continuing with deployment
* 18:47 catrope@deploy1003: catrope: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:45 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1294376{{!}}Fix lastAuthTimestamp hack (T427398)]], [[gerrit:1294375{{!}}auth: Mark the hidden token field used for reauth as skippable (T427398)]]
* 18:40 joal@deploy1003: Finished deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f] (duration: 01m 05s)
* 18:39 joal@deploy1003: Started deploy [analytics/refinery@96cf761]: Regular analytics weekly train [analytics/refinery@96cf761f]
* 18:37 joal@deploy1003: Finished deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f] (duration: 02m 04s)
* 18:35 joal@deploy1003: Started deploy [analytics/refinery@96cf761] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@96cf761f]
* 18:29 swfrench@deploy1003: Finished scap sync-world: Helmfile-only deployment to clean up unused mesh listeners (duration: 06m 12s)
* 18:25 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:24 swfrench@deploy1003: swfrench: Helmfile-only deployment to clean up unused mesh listeners synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:23 swfrench@deploy1003: Started scap sync-world: Helmfile-only deployment to clean up unused mesh listeners
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93296 and previous config saved to /var/cache/conftool/dbconfig/20260527-181923-fceratto.json
* 18:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:11 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 18:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93295 and previous config saved to /var/cache/conftool/dbconfig/20260527-180915-fceratto.json
* 18:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 18:09 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] (duration: 10m 24s)
* 18:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp5024.eqsin.wmnet
* 18:03 swfrench@deploy1003: swfrench: Continuing with deployment
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:02 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:01 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:01 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:00 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:00 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:00 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264', diff saved to https://phabricator.wikimedia.org/P93294 and previous config saved to /var/cache/conftool/dbconfig/20260527-175908-fceratto.json
* 17:58 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293776{{!}}ProductionServices: Revert to discovery shellbox listeners]]
* 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93293 and previous config saved to /var/cache/conftool/dbconfig/20260527-174900-fceratto.json
* 17:43 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] (duration: 15m 01s)
* 17:38 swfrench@deploy1003: swfrench: Continuing with deployment
* 17:31 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:28 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293774{{!}}ProductionServices: Temporarily use shellbox in codfw]]
* 17:25 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1114.eqiad.wmnet
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:13 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:05 swfrench@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] (duration: 08m 44s)
* 17:00 swfrench@deploy1003: swfrench: Continuing with deployment
* 16:58 swfrench@deploy1003: swfrench: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:56 swfrench@deploy1003: Started scap sync-world: Backport for [[gerrit:1293775{{!}}ProductionServices: Temporarily use shellbox in eqiad]]
* 16:53 atsuko@dns1004: END - running authdns-update
* 16:51 atsuko@dns1004: START - running authdns-update
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1264 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93292 and previous config saved to /var/cache/conftool/dbconfig/20260527-164846-fceratto.json
* 16:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1264.eqiad.wmnet with reason: Maintenance
* 16:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93291 and previous config saved to /var/cache/conftool/dbconfig/20260527-164815-fceratto.json
* 16:43 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp1112.eqiad.wmnet
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: Setting up
* 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93290 and previous config saved to /var/cache/conftool/dbconfig/20260527-163808-fceratto.json
* 16:37 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Repooling after testing patch
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P93287 and previous config saved to /var/cache/conftool/dbconfig/20260527-162800-fceratto.json
* 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93285 and previous config saved to /var/cache/conftool/dbconfig/20260527-161753-fceratto.json
* 16:14 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 16:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 16:12 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 16:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93284 and previous config saved to /var/cache/conftool/dbconfig/20260527-161101-fceratto.json
* 16:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
* 16:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93283 and previous config saved to /var/cache/conftool/dbconfig/20260527-161034-fceratto.json
* 16:10 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 16:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1178: Recovering from failure in cookbook
* 16:10 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 16:05 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host durum5003.eqsin.wmnet with OS trixie
* 16:03 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6016.drmrs.wmnet
* 16:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93280 and previous config saved to /var/cache/conftool/dbconfig/20260527-160027-fceratto.json
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1017.eqiad.wmnet
* 15:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2163.codfw.wmnet
* 15:53 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2163.codfw.wmnet
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1017.eqiad.wmnet
* 15:52 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Repooling after testing patch
* 15:52 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6016.drmrs.wmnet,cp[1112,1114].eqiad.wmnet,cp[5024,5031-5032].eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 15:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Testing cookbook
* 15:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Testing cookbook
* 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220', diff saved to https://phabricator.wikimedia.org/P93276 and previous config saved to /var/cache/conftool/dbconfig/20260527-155019-fceratto.json
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93274 and previous config saved to /var/cache/conftool/dbconfig/20260527-154011-fceratto.json
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 15:32 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2163: Migration of db2163.codfw.wmnet completed
* 15:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1178: Recovering from failure in cookbook
* 15:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1178.eqiad.wmnet
* 15:22 cwilliams@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1178.eqiad.wmnet
* 15:19 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 15:19 cdanis: cdanis@cp4047.ulsfo.wmnet ~ $ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:13 cdanis: cdanis@cp5026.eqsin.wmnet ~ $ sudo apt install lua5.4-ciderbloom lua5.4-ciderbloom-dbgsym
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:11 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Icinga wait failed during run
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:09 cdanis: cdanis@apt1002.wikimedia.org ~ $ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1220 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93268 and previous config saved to /var/cache/conftool/dbconfig/20260527-150508-fceratto.json
* 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1220.eqiad.wmnet with reason: Maintenance
* 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93267 and previous config saved to /var/cache/conftool/dbconfig/20260527-150438-fceratto.json
* 14:59 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2163: Migration of db2163.codfw.wmnet completed
* 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93264 and previous config saved to /var/cache/conftool/dbconfig/20260527-145430-fceratto.json
* 14:54 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2163.codfw.wmnet with OS trixie
* 14:51 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 14:46 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] (duration: 08m 32s)
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS trixie
* 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P93263 and previous config saved to /var/cache/conftool/dbconfig/20260527-144423-fceratto.json
* 14:42 aude@deploy1003: aude: Continuing with deployment
* 14:40 aude@deploy1003: aude: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2189.codfw.wmnet with reason: crashed [[phab:T427376|T427376]]
* 14:38 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290926{{!}}Re-enable ReadingLists QuickSurvey (T426781)]]
* 14:35 aude@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] (duration: 11m 30s)
* 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93262 and previous config saved to /var/cache/conftool/dbconfig/20260527-143416-fceratto.json
* 14:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2163.codfw.wmnet with reason: host reimage
* 14:29 aude@deploy1003: aude: Continuing with deployment
* 14:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:27 aude@deploy1003: aude: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:27 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93260 and previous config saved to /var/cache/conftool/dbconfig/20260527-142659-fceratto.json
* 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:23 aude@deploy1003: Started scap sync-world: Backport for [[gerrit:1290924{{!}}Make logging of title and page ID optional (T426457)]]
* 14:22 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
* 14:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1033.eqiad.wmnet with reason: Maintenance
* 14:18 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] (duration: 33m 01s)
* 14:10 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2163.codfw.wmnet with OS trixie
* 14:09 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS trixie
* 14:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2163: Upgrading db2163.codfw.wmnet
* 14:08 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1178: Upgrading db1178.eqiad.wmnet
* 14:07 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1178: Upgrading db1178.eqiad.wmnet
* 14:06 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:06 stran@deploy1003: stran: Continuing with deployment
* 14:02 stran@deploy1003: stran: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:56 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2164: Migration of db2164.codfw.wmnet completed
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 13:51 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:45 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1294247{{!}}Update Direct Reporting email (T427358)]]
* 13:40 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] (duration: 11m 35s)
* 13:36 phuedx@deploy1003: phuedx: Continuing with deployment
* 13:30 phuedx@deploy1003: phuedx: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1294217{{!}}ext.wikimediaEvents: Add hoisting error detection test (T427092)]]
* 13:21 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] (duration: 13m 23s)
* 13:15 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2189: Test
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 13:15 mlitn@deploy1003: krinkle, mlitn: Continuing with deployment
* 13:13 mlitn@deploy1003: krinkle, mlitn: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:10 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2164: Migration of db2164.codfw.wmnet completed
* 13:08 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1290781{{!}}mmv: Fix missing or stale arrow and counter controls (T426960)]], [[gerrit:1294264{{!}}MMV Carousel: Restore click-to-open for carousel thumbnails (T426225)]]
* 13:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 99 days, 0:00:00 on db2212.codfw.wmnet with reason: failed to reboot [[phab:T427388|T427388]] [[phab:T426633|T426633]]
* 13:05 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1192: Migration of db1192.eqiad.wmnet completed
* 13:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2164.codfw.wmnet with OS trixie
* 12:57 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1192.eqiad.wmnet with OS trixie
* 12:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2164.codfw.wmnet with reason: host reimage
* 12:35 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1192.eqiad.wmnet with reason: host reimage
* 12:28 Amir1: deleting binlogs older than a year
* 12:22 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2164.codfw.wmnet with OS trixie
* 12:21 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36692
* 12:21 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1192.eqiad.wmnet with OS trixie
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1077
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 12:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2164: Upgrading db2164.codfw.wmnet
* 12:20 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 36692
* 12:20 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 12:20 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1079
* 12:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2164: Upgrading db2164.codfw.wmnet
* 12:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 12:19 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:19 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1192: Upgrading db1192.eqiad.wmnet
* 12:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1192: Upgrading db1192.eqiad.wmnet
* 12:18 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:15 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Migration of db2165.codfw.wmnet completed
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:14 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1078 to eqiad - jclark@cumin1003"
* 12:12 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2189: Test
* 12:11 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2189: Test
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1193: Migration of db1193.eqiad.wmnet completed
* 12:09 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2212 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93243 and previous config saved to /var/cache/conftool/dbconfig/20260527-120452-fceratto.json
* 12:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93242 and previous config saved to /var/cache/conftool/dbconfig/20260527-120205-fceratto.json
* 12:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:58 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:58 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "is everything alright? /cc effie - ayounsi@cumin1003"
* 11:56 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93239 and previous config saved to /var/cache/conftool/dbconfig/20260527-115157-fceratto.json
* 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P93237 and previous config saved to /var/cache/conftool/dbconfig/20260527-114149-fceratto.json
* 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93235 and previous config saved to /var/cache/conftool/dbconfig/20260527-113142-fceratto.json
* 11:29 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Migration of db2165.codfw.wmnet completed
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1193: Migration of db1193.eqiad.wmnet completed
* 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93231 and previous config saved to /var/cache/conftool/dbconfig/20260527-112327-fceratto.json
* 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 11:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93230 and previous config saved to /var/cache/conftool/dbconfig/20260527-112257-fceratto.json
* 11:19 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2165.codfw.wmnet with OS trixie
* 11:15 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1193.eqiad.wmnet with OS trixie
* 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93229 and previous config saved to /var/cache/conftool/dbconfig/20260527-111250-fceratto.json
* 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:08 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P93227 and previous config saved to /var/cache/conftool/dbconfig/20260527-110242-fceratto.json
* 11:02 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
* 11:02 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 11:01 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 11:01 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2189', diff saved to https://phabricator.wikimedia.org/P93226 and previous config saved to /var/cache/conftool/dbconfig/20260527-110016-marostegui.json
* 10:58 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:57 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2165.codfw.wmnet with reason: host reimage
* 10:56 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93225 and previous config saved to /var/cache/conftool/dbconfig/20260527-105235-fceratto.json
* 10:52 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1193.eqiad.wmnet with reason: host reimage
* 10:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93223 and previous config saved to /var/cache/conftool/dbconfig/20260527-104518-fceratto.json
* 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93222 and previous config saved to /var/cache/conftool/dbconfig/20260527-104449-fceratto.json
* 10:39 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2165.codfw.wmnet with OS trixie
* 10:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1193.eqiad.wmnet with OS trixie
* 10:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1193: Upgrading db1193.eqiad.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2165: Upgrading db2165.codfw.wmnet
* 10:35 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2165: Upgrading db2165.codfw.wmnet
* 10:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93218 and previous config saved to /var/cache/conftool/dbconfig/20260527-103441-fceratto.json
* 10:29 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:29 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P93217 and previous config saved to /var/cache/conftool/dbconfig/20260527-102434-fceratto.json
* 10:22 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:21 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93215 and previous config saved to /var/cache/conftool/dbconfig/20260527-101426-fceratto.json
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1203: Migration of db1203.eqiad.wmnet completed
* 10:10 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:10 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2166: Migration of db2166.codfw.wmnet completed
* 10:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93212 and previous config saved to /var/cache/conftool/dbconfig/20260527-100701-fceratto.json
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 10:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93211 and previous config saved to /var/cache/conftool/dbconfig/20260527-100632-fceratto.json
* 10:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 10:04 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1050.eqiad.wmnet with OS trixie
* 09:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93208 and previous config saved to /var/cache/conftool/dbconfig/20260527-095624-fceratto.json
* 09:47 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P93206 and previous config saved to /var/cache/conftool/dbconfig/20260527-094616-fceratto.json
* 09:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:43 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es1050.eqiad.wmnet with reason: host reimage
* 09:38 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:38 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:37 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:37 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 09:36 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93203 and previous config saved to /var/cache/conftool/dbconfig/20260527-093609-fceratto.json
* 09:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93202 and previous config saved to /var/cache/conftool/dbconfig/20260527-092842-fceratto.json
* 09:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 09:28 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1203: Migration of db1203.eqiad.wmnet completed
* 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93200 and previous config saved to /var/cache/conftool/dbconfig/20260527-092814-fceratto.json
* 09:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es1050.eqiad.wmnet with OS trixie
* 09:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es1050: Upgrading es1050.eqiad.wmnet
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1050: repool after maintenance
* 09:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es1050: repool after maintenance
* 09:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2166: Migration of db2166.codfw.wmnet completed
* 09:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2051: repool after maintenance
* 09:20 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS trixie
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93196 and previous config saved to /var/cache/conftool/dbconfig/20260527-091806-fceratto.json
* 09:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2166.codfw.wmnet with OS trixie
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P93194 and previous config saved to /var/cache/conftool/dbconfig/20260527-090759-fceratto.json
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3074.*
* 09:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp3066.*
* 09:03 fabfur: repooling cp3074 and cp3066 ([[phab:T419825|T419825]])
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp6015.drmrs.wmnet
* 09:02 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for cp6015.drmrs.wmnet
* 09:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 09:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=cp6015.*
* 08:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93193 and previous config saved to /var/cache/conftool/dbconfig/20260527-085751-fceratto.json
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
* 08:54 Emperor: restart swift on ms-fe2011 [[phab:T360913|T360913]]
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:54 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2166.codfw.wmnet with reason: host reimage
* 08:54 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:53 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 08:52 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 08:51 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3066.*
* 08:51 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp3074.*
* 08:51 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:50 fabfur: depooling and installing haproxy-awslc on cp3074 and cp3066 ([[phab:T419825|T419825]])
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93191 and previous config saved to /var/cache/conftool/dbconfig/20260527-085024-fceratto.json
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93190 and previous config saved to /var/cache/conftool/dbconfig/20260527-085005-fceratto.json
* 08:41 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS trixie
* 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93189 and previous config saved to /var/cache/conftool/dbconfig/20260527-083957-fceratto.json
* 08:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2051: repool after maintenance
* 08:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 08:36 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1203: Upgrading db1203.eqiad.wmnet
* 08:36 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:35 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2166.codfw.wmnet with OS trixie
* 08:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2051.codfw.wmnet with OS trixie
* 08:34 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2166: Upgrading db2166.codfw.wmnet
* 08:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P93185 and previous config saved to /var/cache/conftool/dbconfig/20260527-082950-fceratto.json
* 08:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93184 and previous config saved to /var/cache/conftool/dbconfig/20260527-081942-fceratto.json
* 08:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2051.codfw.wmnet with reason: host reimage
* 08:11 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93183 and previous config saved to /var/cache/conftool/dbconfig/20260527-081112-fceratto.json
* 08:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93182 and previous config saved to /var/cache/conftool/dbconfig/20260527-081054-fceratto.json
* 08:07 jmm@dns1004: END - running authdns-update
* 08:05 jmm@dns1004: START - running authdns-update
* 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93181 and previous config saved to /var/cache/conftool/dbconfig/20260527-080046-fceratto.json
* 07:59 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2051.codfw.wmnet with OS trixie
* 07:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P93180 and previous config saved to /var/cache/conftool/dbconfig/20260527-075039-fceratto.json
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1026.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1026.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2051: Upgrading es2051.codfw.wmnet
* 07:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2051: Upgrading es2051.codfw.wmnet
* 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93178 and previous config saved to /var/cache/conftool/dbconfig/20260527-074031-fceratto.json
* 07:40 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] (duration: 06m 42s)
* 07:36 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 07:35 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93177 and previous config saved to /var/cache/conftool/dbconfig/20260527-073504-fceratto.json
* 07:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93176 and previous config saved to /var/cache/conftool/dbconfig/20260527-073434-fceratto.json
* 07:33 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1294125{{!}}Add script to demote ineligible members of restricted global groups (T425395)]], [[gerrit:1294126{{!}}Add script to demote ineligible members of restricted global groups (T425395)]]
* 07:28 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93175 and previous config saved to /var/cache/conftool/dbconfig/20260527-072426-fceratto.json
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.decommission (exit_code=0)
* 07:23 marostegui@cumin1003: Removing pc1014 from zarcillo [[phab:T427190|T427190]]
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1014.eqiad.wmnet
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:23 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:18 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:15 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1026.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1025.eqiad.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P93174 and previous config saved to /var/cache/conftool/dbconfig/20260527-071418-fceratto.json
* 07:13 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1014.eqiad.wmnet
* 07:13 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1025.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2003.wikimedia.org
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2055: repool after maintenance
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2003.wikimedia.org
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1003.wikimedia.org
* 07:06 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1190.eqiad.wmnet with reason: Maintenance on db1190
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93172 and previous config saved to /var/cache/conftool/dbconfig/20260527-070410-fceratto.json
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1003.wikimedia.org
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93171 and previous config saved to /var/cache/conftool/dbconfig/20260527-065545-fceratto.json
* 06:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 06:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93170 and previous config saved to /var/cache/conftool/dbconfig/20260527-065526-fceratto.json
* 06:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1025.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93168 and previous config saved to /var/cache/conftool/dbconfig/20260527-064519-fceratto.json
* 06:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P93166 and previous config saved to /var/cache/conftool/dbconfig/20260527-063511-fceratto.json
* 06:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93165 and previous config saved to /var/cache/conftool/dbconfig/20260527-062503-fceratto.json
* 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2055: repool after maintenance
* 06:21 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2055.codfw.wmnet with OS trixie
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93163 and previous config saved to /var/cache/conftool/dbconfig/20260527-061643-fceratto.json
* 06:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93162 and previous config saved to /var/cache/conftool/dbconfig/20260527-061613-fceratto.json
* 06:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93161 and previous config saved to /var/cache/conftool/dbconfig/20260527-060606-fceratto.json
* 06:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on es2055.codfw.wmnet with reason: host reimage
* 05:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P93160 and previous config saved to /var/cache/conftool/dbconfig/20260527-055558-fceratto.json
* 05:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93159 and previous config saved to /var/cache/conftool/dbconfig/20260527-054550-fceratto.json
* 05:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host es2055.codfw.wmnet with OS trixie
* 05:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2055: Upgrading es2055.codfw.wmnet
* 05:40 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 05:38 moritzm: remove ganeti1026 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93157 and previous config saved to /var/cache/conftool/dbconfig/20260527-053727-fceratto.json
* 05:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93156 and previous config saved to /var/cache/conftool/dbconfig/20260527-053708-fceratto.json
* 05:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93155 and previous config saved to /var/cache/conftool/dbconfig/20260527-052700-fceratto.json
* 05:26 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1014 from dbctl [[phab:T427270|T427270]]', diff saved to https://phabricator.wikimedia.org/P93154 and previous config saved to /var/cache/conftool/dbconfig/20260527-052624-marostegui.json
* 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P93153 and previous config saved to /var/cache/conftool/dbconfig/20260527-051653-fceratto.json
* 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93152 and previous config saved to /var/cache/conftool/dbconfig/20260527-050645-fceratto.json
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93151 and previous config saved to /var/cache/conftool/dbconfig/20260527-045827-fceratto.json
* 04:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 04:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93150 and previous config saved to /var/cache/conftool/dbconfig/20260527-045759-fceratto.json
* 04:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93149 and previous config saved to /var/cache/conftool/dbconfig/20260527-044751-fceratto.json
* 04:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P93148 and previous config saved to /var/cache/conftool/dbconfig/20260527-043744-fceratto.json
* 04:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93147 and previous config saved to /var/cache/conftool/dbconfig/20260527-042737-fceratto.json
* 04:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93146 and previous config saved to /var/cache/conftool/dbconfig/20260527-041921-fceratto.json
* 04:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 04:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93145 and previous config saved to /var/cache/conftool/dbconfig/20260527-041852-fceratto.json
* 04:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93144 and previous config saved to /var/cache/conftool/dbconfig/20260527-040844-fceratto.json
* 03:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P93143 and previous config saved to /var/cache/conftool/dbconfig/20260527-035836-fceratto.json
* 03:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93142 and previous config saved to /var/cache/conftool/dbconfig/20260527-034828-fceratto.json
* 03:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93141 and previous config saved to /var/cache/conftool/dbconfig/20260527-034008-fceratto.json
* 03:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 03:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93140 and previous config saved to /var/cache/conftool/dbconfig/20260527-033938-fceratto.json
* 03:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93139 and previous config saved to /var/cache/conftool/dbconfig/20260527-032931-fceratto.json
* 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P93138 and previous config saved to /var/cache/conftool/dbconfig/20260527-031923-fceratto.json
* 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93137 and previous config saved to /var/cache/conftool/dbconfig/20260527-030915-fceratto.json
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93136 and previous config saved to /var/cache/conftool/dbconfig/20260527-030045-fceratto.json
* 03:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 03:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93135 and previous config saved to /var/cache/conftool/dbconfig/20260527-030016-fceratto.json
* 02:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93134 and previous config saved to /var/cache/conftool/dbconfig/20260527-025008-fceratto.json
* 02:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P93133 and previous config saved to /var/cache/conftool/dbconfig/20260527-024000-fceratto.json
* 02:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93132 and previous config saved to /var/cache/conftool/dbconfig/20260527-022953-fceratto.json
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93131 and previous config saved to /var/cache/conftool/dbconfig/20260527-022133-fceratto.json
* 02:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 02:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93130 and previous config saved to /var/cache/conftool/dbconfig/20260527-022100-fceratto.json
* 02:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93129 and previous config saved to /var/cache/conftool/dbconfig/20260527-021053-fceratto.json
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P93128 and previous config saved to /var/cache/conftool/dbconfig/20260527-020045-fceratto.json
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93127 and previous config saved to /var/cache/conftool/dbconfig/20260527-015037-fceratto.json
* 01:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93126 and previous config saved to /var/cache/conftool/dbconfig/20260527-014204-fceratto.json
* 01:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 01:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93125 and previous config saved to /var/cache/conftool/dbconfig/20260527-014134-fceratto.json
* 01:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93124 and previous config saved to /var/cache/conftool/dbconfig/20260527-013126-fceratto.json
* 01:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P93123 and previous config saved to /var/cache/conftool/dbconfig/20260527-012119-fceratto.json
* 01:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93122 and previous config saved to /var/cache/conftool/dbconfig/20260527-011111-fceratto.json
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93121 and previous config saved to /var/cache/conftool/dbconfig/20260527-010234-fceratto.json
* 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93120 and previous config saved to /var/cache/conftool/dbconfig/20260527-010205-fceratto.json
* 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93119 and previous config saved to /var/cache/conftool/dbconfig/20260527-005157-fceratto.json
* 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P93118 and previous config saved to /var/cache/conftool/dbconfig/20260527-004149-fceratto.json
* 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93117 and previous config saved to /var/cache/conftool/dbconfig/20260527-003141-fceratto.json
* 00:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93116 and previous config saved to /var/cache/conftool/dbconfig/20260527-002309-fceratto.json
* 00:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 00:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93115 and previous config saved to /var/cache/conftool/dbconfig/20260527-002228-fceratto.json
* 00:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93114 and previous config saved to /var/cache/conftool/dbconfig/20260527-001220-fceratto.json
* 00:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P93113 and previous config saved to /var/cache/conftool/dbconfig/20260527-000209-fceratto.json
== 2026-05-26 ==
* 23:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93112 and previous config saved to /var/cache/conftool/dbconfig/20260526-235201-fceratto.json
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93111 and previous config saved to /var/cache/conftool/dbconfig/20260526-234451-fceratto.json
* 23:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 23:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93110 and previous config saved to /var/cache/conftool/dbconfig/20260526-234421-fceratto.json
* 23:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93109 and previous config saved to /var/cache/conftool/dbconfig/20260526-233414-fceratto.json
* 23:27 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 23:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P93108 and previous config saved to /var/cache/conftool/dbconfig/20260526-232406-fceratto.json
* 23:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93107 and previous config saved to /var/cache/conftool/dbconfig/20260526-231358-fceratto.json
* 23:07 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp5026.*
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93106 and previous config saved to /var/cache/conftool/dbconfig/20260526-230650-fceratto.json
* 23:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 23:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93105 and previous config saved to /var/cache/conftool/dbconfig/20260526-230620-fceratto.json
* 22:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93104 and previous config saved to /var/cache/conftool/dbconfig/20260526-225612-fceratto.json
* 22:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P93103 and previous config saved to /var/cache/conftool/dbconfig/20260526-224604-fceratto.json
* 22:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93101 and previous config saved to /var/cache/conftool/dbconfig/20260526-223556-fceratto.json
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93100 and previous config saved to /var/cache/conftool/dbconfig/20260526-222848-fceratto.json
* 22:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 22:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93099 and previous config saved to /var/cache/conftool/dbconfig/20260526-222828-fceratto.json
* 22:23 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp6015.drmrs.wmnet
* 22:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93098 and previous config saved to /var/cache/conftool/dbconfig/20260526-221819-fceratto.json
* 22:10 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS trixie
* 22:08 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS trixie
* 22:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P93097 and previous config saved to /var/cache/conftool/dbconfig/20260526-220811-fceratto.json
* 22:04 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] (duration: 09m 30s)
* 22:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 22:00 egardner@deploy1003: egardner, mfossati: Continuing with deployment
* 21:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93096 and previous config saved to /var/cache/conftool/dbconfig/20260526-215803-fceratto.json
* 21:57 egardner@deploy1003: egardner, mfossati: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:56 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:56 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1010.eqiad.wmnet with OS trixie
* 21:56 robh@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:55 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1293701{{!}}MultimediaViewer: enable image carousel as a beta feature on testwiki (T426799)]]
* 21:54 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1009.eqiad.wmnet with reason: host reimage
* 21:51 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1008.eqiad.wmnet with reason: host reimage
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93095 and previous config saved to /var/cache/conftool/dbconfig/20260526-215043-fceratto.json
* 21:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 21:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93094 and previous config saved to /var/cache/conftool/dbconfig/20260526-215011-fceratto.json
* 21:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:47 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp6015.drmrs.wmnet
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1009
* 21:44 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1009
* 21:43 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1009
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1009.eqiad.wmnet 120.48.64.10.in-addr.arpa 0.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:43 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:42 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on relforge1010.eqiad.wmnet with reason: host reimage
* 21:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1009 - bking@cumin2002"
* 21:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1008
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host relforge1008
* 21:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93093 and previous config saved to /var/cache/conftool/dbconfig/20260526-214003-fceratto.json
* 21:36 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host relforge1008
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache relforge1008.eqiad.wmnet 100.32.64.10.in-addr.arpa 0.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:36 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:36 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host relforge1008 - bking@cumin2002"
* 21:35 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host relforge1010
* 21:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1010
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1010.eqiad.wmnet with OS trixie
* 21:31 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1009
* 21:30 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS trixie
* 21:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P93092 and previous config saved to /var/cache/conftool/dbconfig/20260526-212955-fceratto.json
* 21:29 bking@cumin2002: START - Cookbook sre.dns.netbox
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host relforge1008
* 21:29 bking@cumin2002: START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS trixie
* 21:27 Dreamy_Jazz: Running `/usr/local/bin/foreachwikiindblist "all.dblist - mediamoderation-continuous-scan.dblist - preinstall.dblist" extensions/MediaModeration/maintenance/scanFilesInScanTable.php --use-jobqueue --sleep=1 --poll-sleep=10 --verbose` in tmux session - [[phab:T421688|T421688]]
* 21:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93091 and previous config saved to /var/cache/conftool/dbconfig/20260526-211948-fceratto.json
* 21:19 jhathaway: dmarc ingress test run mx-in1001
* 21:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and A:cp
* 21:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 21:14 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and A:cp
* 21:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93090 and previous config saved to /var/cache/conftool/dbconfig/20260526-211238-fceratto.json
* 21:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 21:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93089 and previous config saved to /var/cache/conftool/dbconfig/20260526-211207-fceratto.json
* 21:06 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 21:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93088 and previous config saved to /var/cache/conftool/dbconfig/20260526-210159-fceratto.json
* 20:55 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on phab2003.codfw.wmnet with reason: WIP
* 20:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P93087 and previous config saved to /var/cache/conftool/dbconfig/20260526-205152-fceratto.json
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:50 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 20:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93086 and previous config saved to /var/cache/conftool/dbconfig/20260526-204143-fceratto.json
* 20:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 20:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93085 and previous config saved to /var/cache/conftool/dbconfig/20260526-203430-fceratto.json
* 20:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 20:34 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 20:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93084 and previous config saved to /var/cache/conftool/dbconfig/20260526-203357-fceratto.json
* 20:32 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 20:32 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 20:31 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 20:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93083 and previous config saved to /var/cache/conftool/dbconfig/20260526-202349-fceratto.json
* 20:18 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] (duration: 09m 14s)
* 20:14 alexsanford@deploy1003: alexsanford, aude: Continuing with deployment
* 20:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P93082 and previous config saved to /var/cache/conftool/dbconfig/20260526-201341-fceratto.json
* 20:11 alexsanford@deploy1003: alexsanford, aude: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1293161{{!}}Enforce 2FA requirements for phase 3 groups (T423120)]], [[gerrit:1293794{{!}}Re-enable ReadingLists survey on beta cluster (T426781)]]
* 20:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93081 and previous config saved to /var/cache/conftool/dbconfig/20260526-200333-fceratto.json
* 19:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2029.codfw.wmnet with OS trixie
* 19:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2028.codfw.wmnet with OS trixie
* 19:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93080 and previous config saved to /var/cache/conftool/dbconfig/20260526-195632-fceratto.json
* 19:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93079 and previous config saved to /var/cache/conftool/dbconfig/20260526-195557-fceratto.json
* 19:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93078 and previous config saved to /var/cache/conftool/dbconfig/20260526-194549-fceratto.json
* 19:45 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 19:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 19:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2014.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2013.codfw.wmnet with OS trixie
* 19:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host durum5003.eqsin.wmnet with OS trixie
* 19:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P93077 and previous config saved to /var/cache/conftool/dbconfig/20260526-193541-fceratto.json
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:35 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:30 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_IPs - dzahn@cumin2002"
* 19:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93076 and previous config saved to /var/cache/conftool/dbconfig/20260526-192533-fceratto.json
* 19:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:21 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 19:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 19:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS trixie
* 19:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93075 and previous config saved to /var/cache/conftool/dbconfig/20260526-191818-fceratto.json
* 19:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93074 and previous config saved to /var/cache/conftool/dbconfig/20260526-191748-fceratto.json
* 19:16 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 19:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93073 and previous config saved to /var/cache/conftool/dbconfig/20260526-190740-fceratto.json
* 19:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P93072 and previous config saved to /var/cache/conftool/dbconfig/20260526-185732-fceratto.json
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2014.codfw.wmnet with reason: host reimage
* 18:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2013.codfw.wmnet with reason: host reimage
* 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93071 and previous config saved to /var/cache/conftool/dbconfig/20260526-184724-fceratto.json
* 18:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2014.codfw.wmnet with OS trixie
* 18:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 18:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93070 and previous config saved to /var/cache/conftool/dbconfig/20260526-184009-fceratto.json
* 18:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93069 and previous config saved to /var/cache/conftool/dbconfig/20260526-183939-fceratto.json
* 18:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 18:30 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93068 and previous config saved to /var/cache/conftool/dbconfig/20260526-182931-fceratto.json
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:29 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: activate_gitlab-lb_magru-v4 - dzahn@cumin2002"
* 18:24 dzahn@cumin2002: START - Cookbook sre.dns.netbox
* 18:21 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:21 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:20 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P93066 and previous config saved to /var/cache/conftool/dbconfig/20260526-181923-fceratto.json
* 18:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 18:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93065 and previous config saved to /var/cache/conftool/dbconfig/20260526-180915-fceratto.json
* 18:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93064 and previous config saved to /var/cache/conftool/dbconfig/20260526-180205-fceratto.json
* 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93063 and previous config saved to /var/cache/conftool/dbconfig/20260526-180132-fceratto.json
* 18:00 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:54 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:54 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93062 and previous config saved to /var/cache/conftool/dbconfig/20260526-175124-fceratto.json
* 17:42 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] (duration: 07m 25s)
* 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P93060 and previous config saved to /var/cache/conftool/dbconfig/20260526-174117-fceratto.json
* 17:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2089.codfw.wmnet
* 17:37 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 17:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293779{{!}}Enable hCaptcha for VisualEditor and MobileFrontend for group0 (T425940)]]
* 17:33 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93059 and previous config saved to /var/cache/conftool/dbconfig/20260526-173109-fceratto.json
* 17:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:25 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:25 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:24 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:24 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1001 to eqiad - jclark@cumin1003"
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93058 and previous config saved to /var/cache/conftool/dbconfig/20260526-172332-fceratto.json
* 17:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2227.codfw.wmnet with reason: Maintenance
* 17:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93057 and previous config saved to /var/cache/conftool/dbconfig/20260526-172303-fceratto.json
* 17:21 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 17:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 17:20 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 17:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:17 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:16 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 17:14 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:14 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:13 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:13 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:13 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93056 and previous config saved to /var/cache/conftool/dbconfig/20260526-171255-fceratto.json
* 17:11 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:11 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P93055 and previous config saved to /var/cache/conftool/dbconfig/20260526-170247-fceratto.json
* 17:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:57 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93054 and previous config saved to /var/cache/conftool/dbconfig/20260526-165240-fceratto.json
* 16:50 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:50 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:45 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:45 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:44 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93053 and previous config saved to /var/cache/conftool/dbconfig/20260526-164421-fceratto.json
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 16:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dse-k8s-wdqs1002 to eqiad - jclark@cumin1003"
* 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93052 and previous config saved to /var/cache/conftool/dbconfig/20260526-164352-fceratto.json
* 16:42 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 16:41 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 16:40 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:40 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:40 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:40 brett: reboot lvs101[345].eqiad.wmnet
* 16:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:37 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:37 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:36 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:36 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:35 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:34 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and A:cp
* 16:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93051 and previous config saved to /var/cache/conftool/dbconfig/20260526-163344-fceratto.json
* 16:33 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and A:cp
* 16:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:30 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P93050 and previous config saved to /var/cache/conftool/dbconfig/20260526-162336-fceratto.json
* 16:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93049 and previous config saved to /var/cache/conftool/dbconfig/20260526-161328-fceratto.json
* 16:11 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:11 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:10 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=eqiad
* 16:06 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93047 and previous config saved to /var/cache/conftool/dbconfig/20260526-160450-fceratto.json
* 16:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93046 and previous config saved to /var/cache/conftool/dbconfig/20260526-160420-fceratto.json
* 16:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:03 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 28s)
* 16:02 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 16:00 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:00 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:57 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]] (duration: 00m 22s)
* 15:55 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:55 aokoth@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2003 - [[phab:T423727|T423727]]
* 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93045 and previous config saved to /var/cache/conftool/dbconfig/20260526-155413-fceratto.json
* 15:46 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=eqiad
* 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P93044 and previous config saved to /var/cache/conftool/dbconfig/20260526-154405-fceratto.json
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93043 and previous config saved to /var/cache/conftool/dbconfig/20260526-153357-fceratto.json
* 15:30 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:30 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:29 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:28 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93042 and previous config saved to /var/cache/conftool/dbconfig/20260526-152629-fceratto.json
* 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
* 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93041 and previous config saved to /var/cache/conftool/dbconfig/20260526-152559-fceratto.json
* 15:24 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:22 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93040 and previous config saved to /var/cache/conftool/dbconfig/20260526-151552-fceratto.json
* 15:12 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2196: Rack maintenance completed
* 15:10 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2196.codfw.wmnet
* 15:10 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2196.codfw.wmnet
* 15:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=search,name=codfw
* 15:06 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: Rack maintenance completed
* 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P93037 and previous config saved to /var/cache/conftool/dbconfig/20260526-150546-fceratto.json
* 15:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: Rack maintenance completed
* 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]] (duration: 00m 39s)
* 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab1004 for [[phab:T427286|T427286]]
* 15:03 brennen@deploy1003: Finished deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]] (duration: 00m 45s)
* 15:02 brennen@deploy1003: Started deploy [phabricator/deployment@939557b]: deploy phab2002 for [[phab:T427286|T427286]]
* 15:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:01 bjensen: uploading prometheus-memcached-exporter_0.16.0-1_amd64 on apt1002
* 15:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: switch maintenance
* 14:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2196: Rack maintenance completed
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2221.codfw.wmnet
* 14:55 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db2222.codfw.wmnet
* 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93033 and previous config saved to /var/cache/conftool/dbconfig/20260526-145538-fceratto.json
* 14:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 14:52 moritzm: remove ganeti1025 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2222: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:49 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2221: Rack maintenance completed
* 14:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:49 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to cluster codfw and group A
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93030 and previous config saved to /var/cache/conftool/dbconfig/20260526-144718-fceratto.json
* 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
* 14:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93029 and previous config saved to /var/cache/conftool/dbconfig/20260526-144651-fceratto.json
* 14:45 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 14:43 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=search,name=codfw
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2167: Migration of db2167.codfw.wmnet completed
* 14:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93026 and previous config saved to /var/cache/conftool/dbconfig/20260526-143643-fceratto.json
* 14:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1054.eqiad.wmnet with OS trixie
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P93023 and previous config saved to /var/cache/conftool/dbconfig/20260526-142636-fceratto.json
* 14:26 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:24 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Rack maintenance completed
* 14:24 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
* 14:24 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Rack maintenance completed
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 14:19 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:19 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup2015.codfw.wmnet,db2197.codfw.wmnet
* 14:18 jynus: restarting mediabackups@codfw after maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93021 and previous config saved to /var/cache/conftool/dbconfig/20260526-141628-fceratto.json
* 14:16 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 14:14 fabfur: repooled cp2043 ([[phab:T426199|T426199]])
* 14:14 ayounsi@cumin1003: START - Cookbook sre.mysql.pool pool db2223: switch maintenance
* 14:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:14 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 14:13 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] (duration: 06m 40s)
* 14:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 14:10 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1054.eqiad.wmnet with reason: host reimage
* 14:10 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2011.codfw.wmnet
* 14:10 fabfur@cumin1003: START - Cookbook sre.hosts.remove-downtime for lvs2011.codfw.wmnet
* 14:09 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:09 fabfur: restoring lvs2011 as primary ([[phab:T426199|T426199]])
* 14:08 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:08 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93017 and previous config saved to /var/cache/conftool/dbconfig/20260526-140748-fceratto.json
* 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
* 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93016 and previous config saved to /var/cache/conftool/dbconfig/20260526-140718-fceratto.json
* 14:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1293710{{!}}Site info should output thumblimits as array (T427066)]]
* 14:05 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.decommission (exit_code=99)
* 14:05 marostegui@cumin1003: Removing pc1013 from zarcillo [[phab:T427190|T427190]]
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1013.eqiad.wmnet
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:04 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:04 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1013.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 14:00 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93014 and previous config saved to /var/cache/conftool/dbconfig/20260526-135711-fceratto.json
* 13:56 blake@cumin1003: START - Cookbook sre.hosts.reimage for host mc1054.eqiad.wmnet with OS trixie
* 13:55 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2167: Migration of db2167.codfw.wmnet completed
* 13:53 Amir1: drop flaggedrevs tables on cawikinews ([[phab:T423577|T423577]])
* 13:49 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1013.eqiad.wmnet
* 13:49 marostegui@cumin1003: START - Cookbook sre.mysql.decommission
* 13:48 Lucas_WMDE: UTC afternoon backport+config window done
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P93012 and previous config saved to /var/cache/conftool/dbconfig/20260526-134703-fceratto.json
* 13:46 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS trixie
* 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93011 and previous config saved to /var/cache/conftool/dbconfig/20260526-133656-fceratto.json
* 13:36 XioNoX: reboot lsw1-a2-codfw for software upgrade - [[phab:T426199|T426199]]
* 13:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2223: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2222: switch maintenance
* 13:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: switch maintenance
* 13:35 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] (duration: 09m 28s)
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2221: switch maintenance
* 13:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2196: switch maintenance
* 13:34 ayounsi@cumin1003: START - Cookbook sre.mysql.depool depool db2196: switch maintenance
* 13:31 ayounsi@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:29 ayounsi@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet,wikikube-worker[2248-2250].codfw.wmnet
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93006 and previous config saved to /var/cache/conftool/dbconfig/20260526-132927-fceratto.json
* 13:29 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
* 13:29 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Switch maintenance
* 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93005 and previous config saved to /var/cache/conftool/dbconfig/20260526-132857-fceratto.json
* 13:28 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lsw1-a2-codfw,lsw1-a2-codfw IPv6,lsw1-a2-codfw.mgmt with reason: Switch maintenance
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1293662{{!}}Enable IRS Direct Reporting on testwiki (T425025)]]
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
* 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] (duration: 08m 30s)
* 13:22 ladsgroup@dns1004: END - running authdns-update
* 13:20 ladsgroup@dns1004: START - running authdns-update
* 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93004 and previous config saved to /var/cache/conftool/dbconfig/20260526-131850-fceratto.json
* 13:18 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with deployment
* 13:16 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:14 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293706{{!}}Disable the `no` language code for translation (T424613)]]
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] (duration: 07m 09s)
* 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P93003 and previous config saved to /var/cache/conftool/dbconfig/20260526-130842-fceratto.json
* 13:08 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:07 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS trixie
* 13:07 sbisson@deploy1003: sbisson: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2167: Upgrading db2167.codfw.wmnet
* 13:05 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1293177{{!}}Instrumentation: log new articles namespace and source (T422146)]]
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2167: Upgrading db2167.codfw.wmnet
* 13:04 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:04 kart_: Update Recommendation API to 2026-05-26-074931-production
* 13:03 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:00 topranks: deactivate CR BGP to doh2002 to test backup path via doh2001
* 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P93000 and previous config saved to /var/cache/conftool/dbconfig/20260526-125834-fceratto.json
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92999 and previous config saved to /var/cache/conftool/dbconfig/20260526-125135-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
* 12:51 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92998 and previous config saved to /var/cache/conftool/dbconfig/20260526-125105-fceratto.json
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92997 and previous config saved to /var/cache/conftool/dbconfig/20260526-124059-fceratto.json
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc2003.wikimedia.org
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1214: Migration of db1214.eqiad.wmnet completed
* 12:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc2003.wikimedia.org
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P92995 and previous config saved to /var/cache/conftool/dbconfig/20260526-123052-fceratto.json
* 12:26 fabfur: depooled cp2043 for network activity ([[phab:T426199|T426199]])
* 12:26 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 12:24 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ssw1-a1-codfw,ssw1-a1-codfw IPv6,ssw1-a1-codfw.mgmt with reason: Switch maintenance
* 12:24 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 12:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
* 12:23 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 12:23 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 12:22 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92993 and previous config saved to /var/cache/conftool/dbconfig/20260526-122044-fceratto.json
* 12:20 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 12:19 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92991 and previous config saved to /var/cache/conftool/dbconfig/20260526-121336-fceratto.json
* 12:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92990 and previous config saved to /var/cache/conftool/dbconfig/20260526-121306-fceratto.json
* 12:09 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned downtime for rack maintenance
* 12:08 fabfur: downtime, disable puppet and stop pybal for rack maintenance ([[phab:T426199|T426199]])
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2181: Migration of db2181.codfw.wmnet completed
* 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92987 and previous config saved to /var/cache/conftool/dbconfig/20260526-120258-fceratto.json
* 12:01 XioNoX: start ssw1-a1-codfw network maintenance (no impact expected as the spines are redundant)
* 11:59 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] (duration: 15m 26s)
* 11:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup2015.codfw.wmnet,db2197.codfw.wmnet with reason: network maintenance
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1005.eqiad.wmnet
* 11:55 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 11:54 jynus: stopping mediabackups@codfw for maintenance on a codfw backup media storage server [[phab:T426199|T426199]]
* 11:54 jmm@dns1004: END - running authdns-update
* 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92985 and previous config saved to /var/cache/conftool/dbconfig/20260526-115251-fceratto.json
* 11:52 jmm@dns1004: START - running authdns-update
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1005.eqiad.wmnet
* 11:49 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1214: Migration of db1214.eqiad.wmnet completed
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aux-k8s-etcd1004.eqiad.wmnet
* 11:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:46 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host aux-k8s-etcd1004.eqiad.wmnet
* 11:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1293167{{!}}hCaptcha: Complete rollout to all wikis (group2 + cleanup) (T425354)]], [[gerrit:1290055{{!}}hCaptcha: Exempt CommunityRequests pages from edit/create triggers (T426897)]]
* 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92983 and previous config saved to /var/cache/conftool/dbconfig/20260526-114243-fceratto.json
* 11:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:41 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1214.eqiad.wmnet with OS trixie
* 11:35 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] (duration: 06m 46s)
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92981 and previous config saved to /var/cache/conftool/dbconfig/20260526-113542-fceratto.json
* 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92980 and previous config saved to /var/cache/conftool/dbconfig/20260526-113521-fceratto.json
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Continuing with deployment
* 11:31 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1222: Migration of db1222.eqiad.wmnet completed
* 11:29 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1293691{{!}}Fix path to wikibase.wikiprojects.tracking.js (T421856 T427252)]]
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92978 and previous config saved to /var/cache/conftool/dbconfig/20260526-112513-fceratto.json
* 11:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92977 and previous config saved to /var/cache/conftool/dbconfig/20260526-112326-marostegui.json
* 11:22 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2181: Migration of db2181.codfw.wmnet completed
* 11:22 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92975 and previous config saved to /var/cache/conftool/dbconfig/20260526-112215-marostegui.json
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Switchover es2042 es2041 for [[phab:T426199|T426199]]', diff saved to https://phabricator.wikimedia.org/P92974 and previous config saved to /var/cache/conftool/dbconfig/20260526-112028-fceratto.json
* 11:17 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1214.eqiad.wmnet with reason: host reimage
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P92972 and previous config saved to /var/cache/conftool/dbconfig/20260526-111506-fceratto.json
* 11:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS trixie
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92971 and previous config saved to /var/cache/conftool/dbconfig/20260526-110458-fceratto.json
* 11:02 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1214.eqiad.wmnet with OS trixie
* 11:00 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] (duration: 15m 50s)
* 11:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1214: Upgrading db1214.eqiad.wmnet
* 10:59 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92968 and previous config saved to /var/cache/conftool/dbconfig/20260526-105755-fceratto.json
* 10:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92967 and previous config saved to /var/cache/conftool/dbconfig/20260526-105726-fceratto.json
* 10:56 jiji@deploy1003: jiji: Continuing with deployment
* 10:55 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:51 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2181.codfw.wmnet with reason: host reimage
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92966 and previous config saved to /var/cache/conftool/dbconfig/20260526-104718-fceratto.json
* 10:46 jiji@deploy1003: jiji: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:44 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1293095{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6382 (T418261 T419976)]]
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P92964 and previous config saved to /var/cache/conftool/dbconfig/20260526-103711-fceratto.json
* 10:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2181.codfw.wmnet with OS trixie
* 10:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:32 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/eventstreams-internal: apply
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92963 and previous config saved to /var/cache/conftool/dbconfig/20260526-102703-fceratto.json
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1226: Migration of db1226.eqiad.wmnet completed
* 10:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2181: Upgrading db2181.codfw.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92960 and previous config saved to /var/cache/conftool/dbconfig/20260526-101936-fceratto.json
* 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92959 and previous config saved to /var/cache/conftool/dbconfig/20260526-101842-fceratto.json
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 10:16 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:15 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 10:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] (duration: 06m 42s)
* 10:09 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92957 and previous config saved to /var/cache/conftool/dbconfig/20260526-100834-fceratto.json
* 10:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 10:05 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:03 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293668{{!}}hCaptcha: Avoid URL.searchParams in Grade C bundle (T422222)]]
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2195: Migration of db2195.codfw.wmnet completed
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 10:01 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2004.codfw.wmnet
* 10:01 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2004.codfw.wmnet
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 09:58 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92955 and previous config saved to /var/cache/conftool/dbconfig/20260526-095827-fceratto.json
* 09:58 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 09:58 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 09:57 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-eqiad@eqiad
* 09:56 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 09:55 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2004.codfw.wmnet
* 09:54 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2003.codfw.wmnet
* 09:54 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2003.codfw.wmnet
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1006.eqiad.wmnet
* 09:53 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1006.eqiad.wmnet
* 09:52 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-eqiad@eqiad
* 09:52 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] (duration: 08m 07s)
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2043.*
* 09:51 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp2044.*
* 09:48 fabfur: repooling cp2043 and cp2044 (haproxy-awslc) ([[phab:T419825|T419825]])
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92953 and previous config saved to /var/cache/conftool/dbconfig/20260526-094819-fceratto.json
* 09:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1006.eqiad.wmnet
* 09:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:44 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293665{{!}}hCaptcha: Avoid `for (const ... of ...)` in Grade C bundle (T422222)]]
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1006.eqiad.wmnet
* 09:41 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1005.eqiad.wmnet
* 09:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1005.eqiad.wmnet
* 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92951 and previous config saved to /var/cache/conftool/dbconfig/20260526-094115-fceratto.json
* 09:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 09:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3009.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92950 and previous config saved to /var/cache/conftool/dbconfig/20260526-094045-fceratto.json
* 09:40 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1226: Migration of db1226.eqiad.wmnet completed
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: aux-master-codfw@codfw
* 09:39 elukey@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:38 elukey@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:34 fabfur: depooling cp2044 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1005.eqiad.wmnet
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2003.codfw.wmnet
* 09:34 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2044.*
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1005.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1004.eqiad.wmnet
* 09:33 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1004.eqiad.wmnet
* 09:33 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp2043.*
* 09:32 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] (duration: 06m 52s)
* 09:32 fabfur: depooling cp2043 to install haproxy-awslc ([[phab:T419825|T419825]])
* 09:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1226.eqiad.wmnet with OS trixie
* 09:30 elukey@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: aux-master-codfw@codfw
* 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92947 and previous config saved to /var/cache/conftool/dbconfig/20260526-093031-fceratto.json
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2003.codfw.wmnet
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2002.codfw.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2002.codfw.wmnet
* 09:28 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:28 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:28 kharlan@deploy1003: kharlan: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1004.eqiad.wmnet
* 09:26 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage1003.eqiad.wmnet
* 09:26 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage1003.eqiad.wmnet
* 09:26 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1293661{{!}}hCaptcha: Ship a self-contained Grade C captcha bundle (T422222)]]
* 09:25 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3008.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:25 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2002.codfw.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host kubestage2001.codfw.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host kubestage2001.codfw.wmnet
* 09:21 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs3010.esams.wmnet<nowiki>}</nowiki> and A:liberica
* 09:20 fabfur: start rebooting esams liberica instances ([[phab:T426563|T426563]])
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P92946 and previous config saved to /var/cache/conftool/dbconfig/20260526-092024-fceratto.json
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage1003.eqiad.wmnet
* 09:16 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2195: Migration of db2195.codfw.wmnet completed
* 09:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage1003.eqiad.wmnet
* 09:14 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host kubestage2001.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage100*<nowiki>}</nowiki> and (A:wikikube-staging-master-eqiad or A:wikikube-staging-worker-eqiad)
* 09:14 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>kubestage200*<nowiki>}</nowiki> and (A:wikikube-staging-master-codfw or A:wikikube-staging-worker-codfw)
* 09:14 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] (duration: 06m 47s)
* 09:10 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1226.eqiad.wmnet with reason: host reimage
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92944 and previous config saved to /var/cache/conftool/dbconfig/20260526-091016-fceratto.json
* 09:09 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 09:09 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2195.codfw.wmnet with OS trixie
* 09:07 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293658{{!}}Fix TypeError in Mandatory2FAChecker (T427251)]]
* 09:06 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2224 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92943 and previous config saved to /var/cache/conftool/dbconfig/20260526-090315-fceratto.json
* 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2224.codfw.wmnet with reason: Maintenance
* 09:03 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92942 and previous config saved to /var/cache/conftool/dbconfig/20260526-090256-fceratto.json
* 08:57 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:55 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1226.eqiad.wmnet with OS trixie
* 08:53 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 fabfur: start rebooting ulsfo liberica instances ([[phab:T426563|T426563]])
* 08:53 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] (duration: 07m 23s)
* 08:53 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:53 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1226: Upgrading db1226.eqiad.wmnet
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92941 and previous config saved to /var/cache/conftool/dbconfig/20260526-085248-fceratto.json
* 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 08:51 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1226: Upgrading db1226.eqiad.wmnet
* 08:50 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5005.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:50 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:49 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Migration of db1222.eqiad.wmnet completed
* 08:48 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 08:47 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:46 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293594{{!}}Allow to remove passkeys when there's only one standard 2FA method (T426872)]]
* 08:43 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 08:43 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2195.codfw.wmnet with reason: host reimage
* 08:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] (duration: 09m 56s)
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217', diff saved to https://phabricator.wikimedia.org/P92939 and previous config saved to /var/cache/conftool/dbconfig/20260526-084240-fceratto.json
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1222.eqiad.wmnet with OS trixie
* 08:40 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5004.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:40 fabfur: start rebooting eqsin liberica instances ([[phab:T426563|T426563]])
* 08:39 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 08:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:39 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs5006.eqsin.wmnet<nowiki>}</nowiki> and A:liberica
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1024.eqiad.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:35 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:33 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1292032{{!}}Grant globalblock-local-status to groups with globalblock-whitelist (T277942)]], [[gerrit:1290964{{!}}hCaptcha CommonSettings.php: Don't define sitekeys as config vars]]
* 08:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92938 and previous config saved to /var/cache/conftool/dbconfig/20260526-083233-fceratto.json
* 08:30 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6002.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2217 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92937 and previous config saved to /var/cache/conftool/dbconfig/20260526-082531-fceratto.json
* 08:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2217.codfw.wmnet with reason: Maintenance
* 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92936 and previous config saved to /var/cache/conftool/dbconfig/20260526-082458-fceratto.json
* 08:23 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2195.codfw.wmnet with OS trixie
* 08:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:21 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2195: Upgrading db2195.codfw.wmnet
* 08:20 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2195: Upgrading db2195.codfw.wmnet
* 08:19 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:18 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1222.eqiad.wmnet with reason: host reimage
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92934 and previous config saved to /var/cache/conftool/dbconfig/20260526-081451-fceratto.json
* 08:13 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:12 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 08:10 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6001.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1024.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P92932 and previous config saved to /var/cache/conftool/dbconfig/20260526-080443-fceratto.json
* 08:01 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1222.eqiad.wmnet with OS trixie
* 08:00 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1024.eqiad.wmnet
* 07:59 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti1023.eqiad.wmnet
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:59 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 07:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti1023.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:56 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 07:56 fabfur: start rebooting drmrs liberica instances ([[phab:T426563|T426563]])
* 07:56 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92931 and previous config saved to /var/cache/conftool/dbconfig/20260526-075435-fceratto.json
* 07:52 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1047.eqiad.wmnet
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:51 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:49 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti1023.eqiad.wmnet
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92930 and previous config saved to /var/cache/conftool/dbconfig/20260526-074739-fceratto.json
* 07:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92929 and previous config saved to /var/cache/conftool/dbconfig/20260526-074710-fceratto.json
* 07:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:45 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:45 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:43 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 07:41 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 fabfur@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 07:38 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] (duration: 12m 01s)
* 07:38 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1047.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P92928 and previous config saved to /var/cache/conftool/dbconfig/20260526-073702-fceratto.json
* 07:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 fabfur@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1222: Upgrading db1222.eqiad.wmnet
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 07:35 fabfur: start rebooting magru liberica instances ([[phab:T426563|T426563]])
* 07:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92926 and previous config saved to /var/cache/conftool/dbconfig/20260526-073459-fceratto.json
* 07:32 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 07:31 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260526-072643-fceratto.json
* 07:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 07:26 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1291951{{!}}Enable and configure WikiProjects prototype on Test Wikidata (T424329)]]
* 07:25 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92924 and previous config saved to /var/cache/conftool/dbconfig/20260526-072452-fceratto.json
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 07:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92923 and previous config saved to /var/cache/conftool/dbconfig/20260526-071635-fceratto.json
* 07:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 07:15 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1026.eqiad.wmnet
* 07:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P92922 and previous config saved to /var/cache/conftool/dbconfig/20260526-071444-fceratto.json
* 07:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 07:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2180 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92921 and previous config saved to /var/cache/conftool/dbconfig/20260526-070946-fceratto.json
* 07:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92920 and previous config saved to /var/cache/conftool/dbconfig/20260526-070916-fceratto.json
* 07:09 moritzm: failover Ganeti master in eqiad to ganeti1048
* 07:09 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1047.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1046.eqiad.wmnet
* 07:07 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:06 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
* 07:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92919 and previous config saved to /var/cache/conftool/dbconfig/20260526-070436-fceratto.json
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 07:04 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1046.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 07:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 07:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92918 and previous config saved to /var/cache/conftool/dbconfig/20260526-065909-fceratto.json
* 06:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast2003.wikimedia.org
* 06:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 06:55 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 06:53 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1046.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1045.eqiad.wmnet
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:53 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P92917 and previous config saved to /var/cache/conftool/dbconfig/20260526-064901-fceratto.json
* 06:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1222 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92916 and previous config saved to /var/cache/conftool/dbconfig/20260526-064833-fceratto.json
* 06:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 06:47 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1222: Switchover
* 06:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast6003.wikimedia.org
* 06:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92914 and previous config saved to /var/cache/conftool/dbconfig/20260526-063853-fceratto.json
* 06:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast6003.wikimedia.org
* 06:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92912 and previous config saved to /var/cache/conftool/dbconfig/20260526-063155-fceratto.json
* 06:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:28 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
* 06:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1222: Switchover
* 06:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1222 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92910 and previous config saved to /var/cache/conftool/dbconfig/20260526-061656-fceratto.json
* 06:15 fceratto@dns1005: END - running authdns-update
* 06:14 fceratto@dns1005: START - running authdns-update
* 06:11 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1162 to s2 primary and set section read-write [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92909 and previous config saved to /var/cache/conftool/dbconfig/20260526-061114-fceratto.json
* 06:10 fceratto@cumin1003: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92908 and previous config saved to /var/cache/conftool/dbconfig/20260526-061021-fceratto.json
* 06:10 federico3: Starting s2 eqiad failover from db1222 to db1162 - [[phab:T425622|T425622]]
* 06:04 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1162 with weight 0 [[phab:T425622|T425622]]', diff saved to https://phabricator.wikimedia.org/P92907 and previous config saved to /var/cache/conftool/dbconfig/20260526-060443-fceratto.json
* 06:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T425622|T425622]]
* 06:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:02 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 06:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.global-read-only (exit_code=0)
* 06:00 fceratto@cumin1003: START - Cookbook sre.mysql.global-read-only
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1014.eqiad.wmnet: Maintenance on pc4
* 05:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2024.codfw.wmnet,pc[1014,1024].eqiad.wmnet with reason: Maintenance on pc4
* 04:37 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 04:34 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.47.0-wmf.1 (duration: 02m 32s)
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]] (duration: 36m 24s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.4 refs [[phab:T423913|T423913]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 20s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-25 ==
* 21:00 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1045.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:49 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 20:38 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1045.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1044.eqiad.wmnet
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:37 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:25 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1044.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 20:15 moritzm: truncate krb5kdc.log1 (which made log rotation fail)
* 20:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 19:57 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1044.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1043.eqiad.wmnet
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 19:22 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1043.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 18:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_eqiad
* 18:49 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1115.eqiad.wmnet
* 18:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:33 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.eqsin.wmnet [reason: manually pooling after reboot as icinga was down]
* 18:22 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:22 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5030.eqsin.wmnet
* 18:15 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5023.eqsin.wmnet
* 18:10 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 18:10 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5030*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:09 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1113.eqiad.wmnet
* 18:03 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp1113*<nowiki>}</nowiki> and A:cp
* 18:02 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5023*<nowiki>}</nowiki> and A:cp
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqiad
* 18:01 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-upload_eqsin
* 18:01 sukhe: sre.cdn.roll-reboot cookbooks stalled due to icinga reboot
* 18:00 sukhe@cumin1003: END (ERROR) - Cookbook sre.cdn.roll-reboot (exit_code=97) rolling reboot on A:cp-text_eqsin
* 17:35 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1043.eqiad.wmnet
* 17:31 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1110.eqiad.wmnet [reason: manually pooling after reboot as icinga was down]
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1042.eqiad.wmnet
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:30 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:29 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1111.eqiad.wmnet
* 17:28 sukhe: sukhe@alert1002:~$ sudo systemctl restart icinga.service
* 17:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92903 and previous config saved to /var/cache/conftool/dbconfig/20260525-171310-fceratto.json
* 17:11 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1042.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 17:06 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92902 and previous config saved to /var/cache/conftool/dbconfig/20260525-170302-fceratto.json
* 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P92901 and previous config saved to /var/cache/conftool/dbconfig/20260525-165255-fceratto.json
* 16:51 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1042.eqiad.wmnet
* 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92900 and previous config saved to /var/cache/conftool/dbconfig/20260525-164247-fceratto.json
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1041.eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:41 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1041.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:40 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5021.eqsin.wmnet
* 16:39 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5029.eqsin.wmnet
* 16:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92899 and previous config saved to /var/cache/conftool/dbconfig/20260525-163559-fceratto.json
* 16:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
* 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92898 and previous config saved to /var/cache/conftool/dbconfig/20260525-163512-fceratto.json
* 16:34 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1108.eqiad.wmnet
* 16:30 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1109.eqiad.wmnet
* 16:26 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92897 and previous config saved to /var/cache/conftool/dbconfig/20260525-162505-fceratto.json
* 16:20 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1041.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1040.eqiad.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:16 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1040.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249', diff saved to https://phabricator.wikimedia.org/P92896 and previous config saved to /var/cache/conftool/dbconfig/20260525-161457-fceratto.json
* 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92895 and previous config saved to /var/cache/conftool/dbconfig/20260525-160450-fceratto.json
* 16:02 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2249 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92894 and previous config saved to /var/cache/conftool/dbconfig/20260525-155930-fceratto.json
* 15:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2249.codfw.wmnet with reason: Maintenance
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5020.eqsin.wmnet
* 15:57 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5028.eqsin.wmnet
* 15:52 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1106.eqiad.wmnet
* 15:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1107.eqiad.wmnet
* 15:29 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1040.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1039.eqiad.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:27 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1039.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 15:17 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1013 from dbctl [[phab:T427190|T427190]]', diff saved to https://phabricator.wikimedia.org/P92893 and previous config saved to /var/cache/conftool/dbconfig/20260525-151718-marostegui.json
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5019.eqsin.wmnet
* 15:15 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5027.eqsin.wmnet
* 15:12 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1104.eqiad.wmnet
* 15:11 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1105.eqiad.wmnet
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92892 and previous config saved to /var/cache/conftool/dbconfig/20260525-150309-fceratto.json
* 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92891 and previous config saved to /var/cache/conftool/dbconfig/20260525-145301-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P92890 and previous config saved to /var/cache/conftool/dbconfig/20260525-144253-fceratto.json
* 14:33 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1102.eqiad.wmnet
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92889 and previous config saved to /var/cache/conftool/dbconfig/20260525-143246-fceratto.json
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5026.eqsin.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5018.eqsin.wmnet
* 14:31 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1103.eqiad.wmnet
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92888 and previous config saved to /var/cache/conftool/dbconfig/20260525-142551-fceratto.json
* 14:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92887 and previous config saved to /var/cache/conftool/dbconfig/20260525-142520-fceratto.json
* 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92885 and previous config saved to /var/cache/conftool/dbconfig/20260525-141513-fceratto.json
* 14:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 14:06 sukhe: curl localhost:9090/pools/inference-staging-grpc_30051 shows ml-staging200[1-3].codfw.wmnet as enabled and pooled: [[phab:T424049|T424049]]
* 14:05 sukhe: sukhe@lvs2013:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P92884 and previous config saved to /var/cache/conftool/dbconfig/20260525-140505-fceratto.json
* 14:03 sukhe: sudo cumin 'A:lvs and A:lvs-low-traffic-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service: [[phab:T424049|T424049]]
* 14:02 sukhe: sukhe@lvs2014:~$ sudo systemctl restart pybal.service
* 14:00 sukhe: sudo cumin 'A:lvs and A:lvs-secondary-codfw' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"'
* 13:59 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1039.eqiad.wmnet
* 13:58 sukhe: sudo cumin 'A:lvs and A:eqiad' 'run-puppet-agent --enable "adding new ml-serve (grpc) [[phab:T424049|T424049]]"': NOOP change, since service is codfw only
* 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92882 and previous config saved to /var/cache/conftool/dbconfig/20260525-135458-fceratto.json
* 13:52 Msz2001: Everything deployed, UTC afternoon config+backport window done
* 13:52 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] (duration: 09m 43s)
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1101.eqiad.wmnet
* 13:51 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp1100.eqiad.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5025.eqsin.wmnet
* 13:50 sukhe@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:49 kart_: Updated Recommendation API to 2026-05-21-044522-production
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92881 and previous config saved to /var/cache/conftool/dbconfig/20260525-134807-fceratto.json
* 13:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 13:47 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:47 kartik@deploy1003: helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92880 and previous config saved to /var/cache/conftool/dbconfig/20260525-134737-fceratto.json
* 13:45 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: Reboot
* 13:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293119{{!}}Set $wgAutoconfirmCount to 25 on plwiktionary (T427177)]]
* 13:40 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqiad
* 13:39 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqiad
* 13:38 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] (duration: 08m 14s)
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_eqsin
* 13:38 sukhe@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_eqsin
* 13:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92878 and previous config saved to /var/cache/conftool/dbconfig/20260525-133729-fceratto.json
* 13:34 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:33 kartik@deploy1003: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1038.eqiad.wmnet
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:32 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:31 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290813{{!}}Article Guidance: enable experiment on phase 2 wikis (T426871)]]
* 13:27 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] (duration: 07m 43s)
* 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P92876 and previous config saved to /var/cache/conftool/dbconfig/20260525-132722-fceratto.json
* 13:23 mszwarc@deploy1003: mszwarc, jhsoby: Continuing with deployment
* 13:21 mszwarc@deploy1003: mszwarc, jhsoby: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1038.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 13:20 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1293094{{!}}Update plwikimedia logo to monochrome, following on-wiki change (T427193)]], [[gerrit:1290953{{!}}Update logo, wordmark and tagline for zghwiki (T426406)]]
* 13:19 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] (duration: 15m 53s)
* 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92875 and previous config saved to /var/cache/conftool/dbconfig/20260525-131714-fceratto.json
* 13:12 mszwarc@deploy1003: vadymts1, mszwarc: Continuing with deployment
* 13:12 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92873 and previous config saved to /var/cache/conftool/dbconfig/20260525-131023-fceratto.json
* 13:10 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92872 and previous config saved to /var/cache/conftool/dbconfig/20260525-130950-fceratto.json
* 13:07 mszwarc@deploy1003: vadymts1, mszwarc: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1291966{{!}}Modify various configurations for English Wikibooks (T426992)]]
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1162: Reboot
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92870 and previous config saved to /var/cache/conftool/dbconfig/20260525-125942-fceratto.json
* 12:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reboot
* 12:59 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:58 kart_: Updated cxserver to 2026-05-24-103047-production ([[phab:T426808|T426808]], [[phab:T373418|T373418]])
* 12:56 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:54 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1162: Reboot
* 12:54 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reboot
* 12:54 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:53 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1162.eqiad.wmnet with reason: Reboot
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P92868 and previous config saved to /var/cache/conftool/dbconfig/20260525-124934-fceratto.json
* 12:40 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:39 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:39 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1038.eqiad.wmnet
* 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92867 and previous config saved to /var/cache/conftool/dbconfig/20260525-123927-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92866 and previous config saved to /var/cache/conftool/dbconfig/20260525-123239-fceratto.json
* 12:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92865 and previous config saved to /var/cache/conftool/dbconfig/20260525-123208-fceratto.json
* 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92864 and previous config saved to /var/cache/conftool/dbconfig/20260525-122201-fceratto.json
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1037.eqiad.wmnet
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:17 jiji@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P92863 and previous config saved to /var/cache/conftool/dbconfig/20260525-121153-fceratto.json
* 12:10 jiji@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mc1037.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1003"
* 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92862 and previous config saved to /var/cache/conftool/dbconfig/20260525-120145-fceratto.json
* 11:58 jiji@cumin1003: START - Cookbook sre.dns.netbox
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92861 and previous config saved to /var/cache/conftool/dbconfig/20260525-115504-fceratto.json
* 11:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92860 and previous config saved to /var/cache/conftool/dbconfig/20260525-115434-fceratto.json
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92859 and previous config saved to /var/cache/conftool/dbconfig/20260525-114426-fceratto.json
* 11:43 jiji@cumin1003: START - Cookbook sre.hosts.decommission for hosts mc1037.eqiad.wmnet
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P92858 and previous config saved to /var/cache/conftool/dbconfig/20260525-113419-fceratto.json
* 11:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2160.codfw.wmnet with OS trixie
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92857 and previous config saved to /var/cache/conftool/dbconfig/20260525-112411-fceratto.json
* 11:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92856 and previous config saved to /var/cache/conftool/dbconfig/20260525-111717-fceratto.json
* 11:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92855 and previous config saved to /var/cache/conftool/dbconfig/20260525-111648-fceratto.json
* 11:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92854 and previous config saved to /var/cache/conftool/dbconfig/20260525-110640-fceratto.json
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 11:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2160.codfw.wmnet with reason: host reimage
* 10:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 10:57 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 10:56 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 10:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P92853 and previous config saved to /var/cache/conftool/dbconfig/20260525-105633-fceratto.json
* 10:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92852 and previous config saved to /var/cache/conftool/dbconfig/20260525-104625-fceratto.json
* 10:43 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2160.codfw.wmnet with OS trixie
* 10:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92851 and previous config saved to /var/cache/conftool/dbconfig/20260525-104141-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to pc3 as master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92850 and previous config saved to /var/cache/conftool/dbconfig/20260525-104055-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1023 to dbctl', diff saved to https://phabricator.wikimedia.org/P92849 and previous config saved to /var/cache/conftool/dbconfig/20260525-104027-marostegui.json
* 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92848 and previous config saved to /var/cache/conftool/dbconfig/20260525-103944-fceratto.json
* 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 10:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
* 10:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
* 10:27 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:16 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1011.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.eqiad.wmnet
* 09:59 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1006.eqiad.wmnet
* 09:57 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.eqiad.wmnet
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:28 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92847 and previous config saved to /var/cache/conftool/dbconfig/20260525-091302-fceratto.json
* 09:12 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92846 and previous config saved to /var/cache/conftool/dbconfig/20260525-090255-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231', diff saved to https://phabricator.wikimedia.org/P92845 and previous config saved to /var/cache/conftool/dbconfig/20260525-085247-fceratto.json
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92844 and previous config saved to /var/cache/conftool/dbconfig/20260525-084239-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2231 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92843 and previous config saved to /var/cache/conftool/dbconfig/20260525-083540-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2231.codfw.wmnet with reason: Maintenance
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92842 and previous config saved to /var/cache/conftool/dbconfig/20260525-083511-fceratto.json
* 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92841 and previous config saved to /var/cache/conftool/dbconfig/20260525-082504-fceratto.json
* 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215', diff saved to https://phabricator.wikimedia.org/P92840 and previous config saved to /var/cache/conftool/dbconfig/20260525-081456-fceratto.json
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92839 and previous config saved to /var/cache/conftool/dbconfig/20260525-080448-fceratto.json
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2215 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92838 and previous config saved to /var/cache/conftool/dbconfig/20260525-075739-fceratto.json
* 07:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2215.codfw.wmnet with reason: Maintenance
* 07:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92837 and previous config saved to /var/cache/conftool/dbconfig/20260525-075708-fceratto.json
* 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92836 and previous config saved to /var/cache/conftool/dbconfig/20260525-074700-fceratto.json
* 07:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196', diff saved to https://phabricator.wikimedia.org/P92835 and previous config saved to /var/cache/conftool/dbconfig/20260525-073653-fceratto.json
* 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92834 and previous config saved to /var/cache/conftool/dbconfig/20260525-072645-fceratto.json
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2196 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92833 and previous config saved to /var/cache/conftool/dbconfig/20260525-071953-fceratto.json
* 07:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2196.codfw.wmnet with reason: Maintenance
* 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92832 and previous config saved to /var/cache/conftool/dbconfig/20260525-071924-fceratto.json
* 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92831 and previous config saved to /var/cache/conftool/dbconfig/20260525-070917-fceratto.json
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2233.codfw.wmnet with OS trixie
* 06:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186', diff saved to https://phabricator.wikimedia.org/P92830 and previous config saved to /var/cache/conftool/dbconfig/20260525-065909-fceratto.json
* 06:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92829 and previous config saved to /var/cache/conftool/dbconfig/20260525-064902-fceratto.json
* 06:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2186 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92828 and previous config saved to /var/cache/conftool/dbconfig/20260525-064305-fceratto.json
* 06:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2233.codfw.wmnet with reason: host reimage
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2233.codfw.wmnet with OS trixie
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2233.codfw.wmnet with reason: Reimage to Trixie
* 06:17 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 06:17 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2160.codfw.wmnet with reason: Reboot upgrade m2
* 06:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2233.codfw.wmnet with reason: Reboot upgrade m2
* 06:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1027.eqiad.wmnet with reason: Reboot
* 05:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2023.codfw.wmnet,pc[1013,1023].eqiad.wmnet with reason: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013.eqiad.wmnet: Maintenance on pc3
* 05:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:17 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1013.eqiad.wmnet: Maintenance on pc3
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 43s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-24 ==
* 19:08 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-23 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-22 ==
* 23:39 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:39 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 23:38 arlolra@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 23:37 arlolra@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 22:20 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 22:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:29 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_eqiad: [[phab:T426585|T426585]] - bking@cumin2002
* 20:28 inflatador: bking@deploy1003 set eqiad prod cirrus `node_concurrent_recoveries` up to 7 from 4 [[phab:T426585|T426585]]
* 20:27 inflatador: bking@deploy1003 set codfw prod cirrus `node_concurrent_recoveries` back down to 4 from 7 [[phab:T426585|T426585]]
* 18:39 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 17:34 topranks: enable ttl protection on esams CRs IBGP session
* 17:28 topranks: enable ttl protection on ulsfo CRs IBGP session
* 16:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:12 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:58 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:15 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:14 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:02 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2008-dev.codfw.wmnet
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2008-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb[1020,1022-1025].eqiad.wmnet
* 14:29 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:26 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 14:23 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2008-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudnet2007-dev.codfw.wmnet
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:23 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 14:03 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudnet2007-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 13:59 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb[1020,1022-1025].eqiad.wmnet
* 13:58 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 13:53 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudnet2007-dev.codfw.wmnet
* 13:52 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1018.eqiad.wmnet
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 13:46 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1018.eqiad.wmnet
* 13:25 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for 6 hosts
* 13:16 inflatador: bking@deploy1002 set search_codfw cluster recovery settings from 4 to 7 [[phab:T426560|T426560]]
* 13:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for 6 hosts
* 13:15 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 13:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 13:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp5017.eqsin.wmnet
* 13:10 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet
* 13:09 elukey: uploaded spicerack_12.6.0 to apt.wikimedia.org bookworm-wikimedia
* 13:08 fnegri@cumin1003: END (FAIL) - Cookbook sre.mysql.upgrade (exit_code=99) for clouddb1017.eqiad.wmnet
* 12:59 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp5017.eqsin.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:57 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3081.esams.wmnet
* 12:54 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:41 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3080.esams.wmnet
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 12:03 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp308[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3073.esams.wmnet
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:28 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Migration of db2154.codfw.wmnet completed
* 11:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3072.esams.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 11:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 11:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1172: Migration of db1172.eqiad.wmnet completed
* 11:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[2-3].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 11:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 11:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3079.esams.wmnet
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 10:55 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1058.eqiad.wmnet to cluster eqiad and group C
* 10:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 10:43 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:42 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:42 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2154: Migration of db2154.codfw.wmnet completed
* 10:42 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 10:37 moritzm: remove ganeti1024 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 10:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2154.codfw.wmnet with OS trixie
* 10:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2010.codfw.wmnet with OS trixie
* 10:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 10:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1172: Migration of db1172.eqiad.wmnet completed
* 10:19 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3078.esams.wmnet
* 10:18 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:16 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS trixie
* 10:15 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1017.eqiad.wmnet
* 10:13 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2154.codfw.wmnet with reason: host reimage
* 10:07 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 10:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3071.esams.wmnet
* 09:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:56 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2154.codfw.wmnet with OS trixie
* 09:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
* 09:51 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 09:39 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2154: Upgrading db2154.codfw.wmnet
* 09:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2154: Upgrading db2154.codfw.wmnet
* 09:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:38 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS trixie
* 09:35 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1172: Upgrading db1172.eqiad.wmnet
* 09:34 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2009.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2009.codfw.wmnet with OS trixie
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3070.esams.wmnet
* 09:21 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:16 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 09:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[0-1].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 09:11 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3077.esams.wmnet
* 09:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:03 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 08:47 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:46 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 08:40 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:33 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:30 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3076.esams.wmnet
* 08:18 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp307[6-7].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:15 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:15 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti1058.eqiad.wmnet on all recursors
* 08:15 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ganeti1058 - cmooney@cumin1003"
* 08:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:07 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 08:07 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3069.esams.wmnet
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:05 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 07:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 07:26 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3068.esams.wmnet
* 07:14 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp306[8-9].esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 07:10 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3075.esams.wmnet
* 07:06 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1057.eqiad.wmnet to cluster eqiad and group A
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1057
* 07:01 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1057
* 06:58 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3075.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3067.esams.wmnet
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 06:46 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3067.esams.wmnet<nowiki>}</nowiki> and A:cp
* 06:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 06:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast3007.wikimedia.org
* 06:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast3007.wikimedia.org
* 05:25 marostegui@dns1004: END - running authdns-update
* 05:24 marostegui@dns1004: START - running authdns-update
* 05:23 marostegui: Failover m5-master [[phab:T426633|T426633]]
* 05:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1028.eqiad.wmnet with reason: Reboot
* 05:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy2005.codfw.wmnet with reason: Reboot
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1012.eqiad.wmnet
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:11 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1012.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 05:03 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 04:56 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1012.eqiad.wmnet
== 2026-05-21 ==
* 23:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] (duration: 06m 42s)
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 23:38 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:36 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290954{{!}}Drop not defined config $wgAllowRawHtmlCopyrightMessages]], [[gerrit:1290957{{!}}Drop $wgGraphShowInToolbar definition as unused]], [[gerrit:1290958{{!}}Drop wgMFSearchGenerator definition as unused]], [[gerrit:1290960{{!}}Drop unused wpReportIncidentLocalLinks]]
* 22:26 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2002.codfw.wmnet with OS trixie
* 22:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:03 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2002.codfw.wmnet with reason: host reimage
* 22:02 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:49 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 21:44 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2002.codfw.wmnet with OS trixie
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:25 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:20 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:19 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:26 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 20:16 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 19:22 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase
* 19:10 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:59 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:53 papaul: rebooting msw1-codfw
* 18:50 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 18:39 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:52 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:52 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:51 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:50 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:49 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:48 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:46 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:43 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:42 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:41 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:40 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:39 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 17:39 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:38 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cp6015.drmrs.wmnet with reason: hardware down
* 17:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 17:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:36 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:23 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:22 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:22 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:13 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1016.eqiad.wmnet
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:09 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:09 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92810 and previous config saved to /var/cache/conftool/dbconfig/20260521-170823-ladsgroup.json
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2031.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2030.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2029.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wdqs2028.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2029
* 16:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2028
* 16:55 papaul: rebooting msw-d3-codfw
* 16:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:52 papaul: rebooting msw-c7-codfw
* 16:51 papaul: rebooting msw-c6-codfw
* 16:48 papaul: rebooting msw-b7-codfw
* 16:48 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:45 fnegri@cumin1003: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:43 papaul: rebooting msw-b6-codfw
* 16:40 papaul: rebooting msw-a1-codfw
* 16:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:37 fnegri@cumin1003: START - Cookbook sre.mysql.upgrade for clouddb1014.eqiad.wmnet
* 16:37 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs2031
* 16:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2031
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2030
* 16:35 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2029
* 16:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2028
* 16:34 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:33 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs2028 to codfw - jhancock@cumin2002"
* 16:26 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc1022.eqiad.wmnet with reason: Move to nftables
* 16:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2022.codfw.wmnet with reason: Move to nftables
* 16:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2048: Repooling
* 16:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool pc2 ([[phab:T421705|T421705]])', diff saved to https://phabricator.wikimedia.org/P92807 and previous config saved to /var/cache/conftool/dbconfig/20260521-161808-ladsgroup.json
* 16:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:15 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:05 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 16:02 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:52 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:42 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es2048: Repooling
* 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92804 and previous config saved to /var/cache/conftool/dbconfig/20260521-154108-fceratto.json
* 15:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:34 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92803 and previous config saved to /var/cache/conftool/dbconfig/20260521-153400-fceratto.json
* 15:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
* 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92802 and previous config saved to /var/cache/conftool/dbconfig/20260521-153331-fceratto.json
* 15:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:25 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 15:24 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92801 and previous config saved to /var/cache/conftool/dbconfig/20260521-152323-fceratto.json
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 15:19 claime: Enabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 15:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 15:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P92800 and previous config saved to /var/cache/conftool/dbconfig/20260521-151316-fceratto.json
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
* 15:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 15:07 elukey@cumin1003: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:05 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:04 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] (duration: 10m 11s)
* 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92799 and previous config saved to /var/cache/conftool/dbconfig/20260521-150308-fceratto.json
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 15:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 15:00 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
* 15:00 elukey@cumin1003: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master
* 15:00 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.pki.restart-reboot (exit_code=0) rolling reboot on A:pki
* 14:57 claime: Disabling puppet on A:cp-text - [[phab:T426323|T426323]]
* 14:56 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
* 14:54 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-build1001.eqiad.wmnet
* 14:54 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1290805{{!}}hCaptcha: Enable for DiscussionTools on Group 0 wikis (T426039)]]
* 14:54 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
* 14:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1001.eqiad.wmnet
* 14:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1001.eqiad.wmnet
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92798 and previous config saved to /var/cache/conftool/dbconfig/20260521-145132-fceratto.json
* 14:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
* 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92797 and previous config saved to /var/cache/conftool/dbconfig/20260521-145103-fceratto.json
* 14:50 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-build1001.eqiad.wmnet
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2241: Migration of db2241.codfw.wmnet completed
* 14:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 14:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1001.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>ml-serve1001.eqiad.wmnet<nowiki>}</nowiki> and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
* 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
* 14:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1011.eqiad.wmnet
* 14:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1011.eqiad.wmnet
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92795 and previous config saved to /var/cache/conftool/dbconfig/20260521-144055-fceratto.json
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:37 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1011.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1011.eqiad.wmnet
* 14:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
* 14:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1010.eqiad.wmnet
* 14:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1010.eqiad.wmnet
* 14:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P92793 and previous config saved to /var/cache/conftool/dbconfig/20260521-143045-fceratto.json
* 14:30 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet. on all recursors
* 14:30 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet. on all recursors
* 14:29 elukey@cumin1003: START - Cookbook sre.pki.restart-reboot rolling reboot on A:pki
* 14:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 14:27 slyngshede@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-reboot (exit_code=1) rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 14:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 14:24 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1010.eqiad.wmnet
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
* 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92792 and previous config saved to /var/cache/conftool/dbconfig/20260521-142037-fceratto.json
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 14:17 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 14:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1010.eqiad.wmnet
* 14:14 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1009.eqiad.wmnet
* 14:14 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1009.eqiad.wmnet
* 14:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
* 14:12 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: repool after maintenance
* 14:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92789 and previous config saved to /var/cache/conftool/dbconfig/20260521-140906-fceratto.json
* 14:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 14:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92788 and previous config saved to /var/cache/conftool/dbconfig/20260521-140837-fceratto.json
* 14:08 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1009.eqiad.wmnet
* 14:08 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:07 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 14:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2241: Migration of db2241.codfw.wmnet completed
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1009.eqiad.wmnet
* 14:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1008.eqiad.wmnet
* 14:03 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1008.eqiad.wmnet
* 14:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2241.codfw.wmnet with OS trixie
* 13:59 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 13:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92786 and previous config saved to /var/cache/conftool/dbconfig/20260521-135830-fceratto.json
* 13:58 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1008.eqiad.wmnet
* 13:53 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1007.eqiad.wmnet
* 13:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1007.eqiad.wmnet
* 13:51 Lucas_WMDE: UTC afternoon backport+config window done
* 13:51 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] (duration: 07m 20s)
* 13:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92784 and previous config saved to /var/cache/conftool/dbconfig/20260521-134822-fceratto.json
* 13:48 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1007.eqiad.wmnet
* 13:47 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 13:46 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Continuing with deployment
* 13:45 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, migr: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:44 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 13:43 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290743{{!}}composer.json: Updated symfony/yaml from 7.4.6 to 7.4.12 (T426861)]], [[gerrit:1289347{{!}}Skip init.test.js test if VisualEditor not installed (T426740)]], [[gerrit:1289342{{!}}fix: simplify to show only one icon type for password reveal (T419413)]]
* 13:43 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 13:43 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1007.eqiad.wmnet
* 13:42 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1006.eqiad.wmnet
* 13:42 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1006.eqiad.wmnet
* 13:41 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] (duration: 06m 52s)
* 13:41 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 13:40 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2241.codfw.wmnet with reason: host reimage
* 13:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 13:38 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 13:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92782 and previous config saved to /var/cache/conftool/dbconfig/20260521-133815-fceratto.json
* 13:37 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1006.eqiad.wmnet
* 13:37 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 13:37 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:36 dbrant@deploy1003: dbrant: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 13:35 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1290035{{!}}docroot: Remove non-wikipedias from digital asset links. (T426010 T385520)]]
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1006.eqiad.wmnet
* 13:32 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1005.eqiad.wmnet
* 13:32 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1005.eqiad.wmnet
* 13:31 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] (duration: 09m 11s)
* 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92781 and previous config saved to /var/cache/conftool/dbconfig/20260521-133116-fceratto.json
* 13:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92780 and previous config saved to /var/cache/conftool/dbconfig/20260521-133048-fceratto.json
* 13:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 13:27 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1005.eqiad.wmnet
* 13:27 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: repool after maintenance
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:25 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2241.codfw.wmnet with OS trixie
* 13:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 13:24 sbisson@deploy1003: sbisson: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2241: Upgrading db2241.codfw.wmnet
* 13:23 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 13:22 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1290014{{!}}Enable AG on phase 2 wikis (T426871)]]
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1005.eqiad.wmnet
* 13:22 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1004.eqiad.wmnet
* 13:22 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1004.eqiad.wmnet
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92778 and previous config saved to /var/cache/conftool/dbconfig/20260521-132041-fceratto.json
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] (duration: 11m 55s)
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 13:17 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 13:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1039: Repooling
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Continuing with deployment
* 13:15 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1004.eqiad.wmnet
* 13:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 13:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1004.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92776 and previous config saved to /var/cache/conftool/dbconfig/20260521-131033-fceratto.json
* 13:10 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1003.eqiad.wmnet
* 13:10 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1003.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:10 cwilliams@cumin1003: dbctl commit (dc=all): 'Depool db2241 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92775 and previous config saved to /var/cache/conftool/dbconfig/20260521-131025-cwilliams.json
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 13:10 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:10 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, neriah: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:09 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1290088{{!}}Disable wgUseFilePatrol in ukwiki (T426905)]], [[gerrit:1290032{{!}}Enable 'flood' user group at en.wikiversity (T426882)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 13:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 13:06 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3074.esams.wmnet
* 13:06 cwilliams@cumin1003: dbctl commit (dc=all): 'Promote db2162 to x3 primary [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92774 and previous config saved to /var/cache/conftool/dbconfig/20260521-130609-cwilliams.json
* 13:04 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:04 cezmunsta: Starting x3 codfw failover from db2241 to db2162 - [[phab:T426936|T426936]]
* 13:04 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1003.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 13:00 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92772 and previous config saved to /var/cache/conftool/dbconfig/20260521-130018-fceratto.json
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1003.eqiad.wmnet
* 12:59 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:59 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1002.eqiad.wmnet
* 12:59 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1002.eqiad.wmnet
* 12:58 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:57 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:56 cwilliams@cumin1003: dbctl commit (dc=all): 'Set db2162 with weight 0 [[phab:T426936|T426936]]', diff saved to https://phabricator.wikimedia.org/P92771 and previous config saved to /var/cache/conftool/dbconfig/20260521-125645-cwilliams.json
* 12:56 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 [[phab:T426936|T426936]]
* 12:56 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 12:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 12:54 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3074.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1002.eqiad.wmnet
* 12:54 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:54 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6008.drmrs.wmnet
* 12:53 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:52 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1018.eqiad.wmnet with reason: host reimage
* 12:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1002.eqiad.wmnet
* 12:49 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 12:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:48 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp3066.esams.wmnet
* 12:47 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92770 and previous config saved to /var/cache/conftool/dbconfig/20260521-124707-fceratto.json
* 12:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 12:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1039: Repooling
* 12:46 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 12:45 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:44 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:43 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:43 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] (duration: 07m 54s)
* 12:42 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92768 and previous config saved to /var/cache/conftool/dbconfig/20260521-124014-fceratto.json
* 12:39 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 12:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1018.eqiad.wmnet with OS trixie
* 12:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:36 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:36 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp3066.esams.wmnet<nowiki>}</nowiki> and A:cp
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1290727{{!}}hCaptcha: Finish group1 account creation rollout + itwiki/hewiki for mobile apps (T426045 T425354)]]
* 12:35 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:34 kart_: Updated cxserver to 2026-05-20-034002-production ([[phab:T388690|T388690]], [[phab:T404295|T404295]], [[phab:T391703|T391703]], [[phab:T426605|T426605]])
* 12:34 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 12:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 12:30 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:30 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 12:29 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92767 and previous config saved to /var/cache/conftool/dbconfig/20260521-122905-fceratto.json
* 12:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 12:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92766 and previous config saved to /var/cache/conftool/dbconfig/20260521-122839-fceratto.json
* 12:27 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:27 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:26 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-staging-worker
* 12:23 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2003.codfw.wmnet
* 12:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2003.codfw.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 12:21 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:21 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:21 moritzm: installing nginx security updates
* 12:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 12:20 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-staging-codfw: maintenance
* 12:19 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-staging-codfw: maintenance
* 12:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92765 and previous config saved to /var/cache/conftool/dbconfig/20260521-121832-fceratto.json
* 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2003.codfw.wmnet
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 12:15 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1017.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 12:13 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6007.drmrs.wmnet
* 12:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P92764 and previous config saved to /var/cache/conftool/dbconfig/20260521-120824-fceratto.json
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2003.codfw.wmnet
* 12:07 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2002.codfw.wmnet
* 12:07 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2002.codfw.wmnet
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 12:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[7-8].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 12:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6014.drmrs.wmnet
* 12:00 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1017.eqiad.wmnet with OS trixie
* 12:00 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2002.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
* 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92763 and previous config saved to /var/cache/conftool/dbconfig/20260521-115817-fceratto.json
* 11:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 11:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
* 11:51 taavi: disabling puppet on C:bird to roll out {{Gerrit|1289919}}
* 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92762 and previous config saved to /var/cache/conftool/dbconfig/20260521-115112-fceratto.json
* 11:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
* 11:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet
* 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92761 and previous config saved to /var/cache/conftool/dbconfig/20260521-115043-fceratto.json
* 11:50 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet
* 11:50 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt2002.wikimedia.org
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 11:45 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet
* 11:45 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp1001.eqiad.wmnet
* 11:44 kartik@deploy1003: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt2002.wikimedia.org
* 11:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 11:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92760 and previous config saved to /var/cache/conftool/dbconfig/20260521-114036-fceratto.json
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp1001.eqiad.wmnet
* 11:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testreduce1002.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 11:36 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 11:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 11:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet
* 11:35 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker
* 11:35 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:34 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host testreduce1002.eqiad.wmnet
* 11:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 11:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P92759 and previous config saved to /var/cache/conftool/dbconfig/20260521-113028-fceratto.json
* 11:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2014.codfw.wmnet
* 11:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 11:26 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2014.codfw.wmnet
* 11:20 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6013.drmrs.wmnet
* 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92758 and previous config saved to /var/cache/conftool/dbconfig/20260521-112021-fceratto.json
* 11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2013.codfw.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 11:09 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92757 and previous config saved to /var/cache/conftool/dbconfig/20260521-110851-fceratto.json
* 11:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
* 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92756 and previous config saved to /var/cache/conftool/dbconfig/20260521-110822-fceratto.json
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2013.codfw.wmnet
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 11:04 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6006.drmrs.wmnet
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 11:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92753 and previous config saved to /var/cache/conftool/dbconfig/20260521-105815-fceratto.json
* 10:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1016.eqiad.wmnet with reason: host reimage
* 10:54 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet
* 10:51 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:51 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:50 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036', diff saved to https://phabricator.wikimedia.org/P92752 and previous config saved to /var/cache/conftool/dbconfig/20260521-104807-fceratto.json
* 10:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet
* 10:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 10:44 jiji@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] (duration: 08m 02s)
* 10:43 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:40 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 10:40 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:39 jiji@deploy1003: jiji: Continuing with deployment
* 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92751 and previous config saved to /var/cache/conftool/dbconfig/20260521-103759-fceratto.json
* 10:37 jiji@deploy1003: jiji: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:36 jiji@deploy1003: Started scap sync-world: Backport for [[gerrit:1290709{{!}}ProductionServices.php: switch filebackend.php to rdb2011:6381 (T418261 T419976)]]
* 10:35 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 10:34 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 10:27 dcausse: [[phab:T423993|T423993]]: reindexing all archive indices
* 10:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92749 and previous config saved to /var/cache/conftool/dbconfig/20260521-102630-fceratto.json
* 10:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Maintenance
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92748 and previous config saved to /var/cache/conftool/dbconfig/20260521-102601-fceratto.json
* 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6005.drmrs.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92747 and previous config saved to /var/cache/conftool/dbconfig/20260521-101552-fceratto.json
* 10:15 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 10:14 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 10:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 10:12 moritzm: installing postgresql security updates
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[5-6].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 10:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 10:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon1003.wikimedia.org
* 10:09 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 10:08 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 10:08 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 10:07 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92746 and previous config saved to /var/cache/conftool/dbconfig/20260521-100545-fceratto.json
* 10:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 10:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 10:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 10:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 10:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon1003.wikimedia.org
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netmon2002.wikimedia.org
* 09:59 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-codfw
* 09:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2005.codfw.wmnet
* 09:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2005.codfw.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 09:56 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92745 and previous config saved to /var/cache/conftool/dbconfig/20260521-095536-fceratto.json
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1384.eqiad.wmnet
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netmon2002.wikimedia.org
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/webrequest-page-view-next: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2005.codfw.wmnet
* 09:52 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
* 09:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2004.codfw.wmnet
* 09:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2004.codfw.wmnet
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
* 09:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 09:49 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1384.eqiad.wmnet
* 09:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1383.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92744 and previous config saved to /var/cache/conftool/dbconfig/20260521-094829-fceratto.json
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92743 and previous config saved to /var/cache/conftool/dbconfig/20260521-094801-fceratto.json
* 09:47 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 09:47 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T426563|T426563]]
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2004.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2003.codfw.wmnet
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-master-eqiad
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1004.eqiad.wmnet
* 09:44 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1383.eqiad.wmnet
* 09:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1382.eqiad.wmnet
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 09:39 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 09:38 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1382.eqiad.wmnet
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1381.eqiad.wmnet
* 09:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2003.codfw.wmnet
* 09:38 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2002.codfw.wmnet
* 09:38 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2002.codfw.wmnet
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92742 and previous config saved to /var/cache/conftool/dbconfig/20260521-093754-fceratto.json
* 09:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1004.eqiad.wmnet
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1003.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 09:36 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 09:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6012.drmrs.wmnet
* 09:34 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 09:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 09:33 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1381.eqiad.wmnet
* 09:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1380.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2002.codfw.wmnet
* 09:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl2001.codfw.wmnet
* 09:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl2001.codfw.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet
* 09:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl1002.eqiad.wmnet
* 09:29 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 09:29 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=eqiad
* 09:29 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts.*,name=codfw
* 09:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 09:28 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P92741 and previous config saved to /var/cache/conftool/dbconfig/20260521-092746-fceratto.json
* 09:27 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1380.eqiad.wmnet
* 09:27 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1379.eqiad.wmnet
* 09:27 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 09:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 09:25 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 09:24 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts.*,name=codfw
* 09:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:23 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 09:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1002.eqiad.wmnet
* 09:22 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-eqiad
* 09:22 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1379.eqiad.wmnet
* 09:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1378.eqiad.wmnet
* 09:21 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl2001.codfw.wmnet
* 09:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-master-codfw
* 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1056.eqiad.wmnet to cluster eqiad and group A
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:18 moritzm: remove ganeti1023 from eqiad Ganeti cluster [[phab:T424680|T424680]]
* 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92740 and previous config saved to /var/cache/conftool/dbconfig/20260521-091738-fceratto.json
* 09:16 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1378.eqiad.wmnet
* 09:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1377.eqiad.wmnet
* 09:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1376.eqiad.wmnet
* 09:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1036: Repooling
* 09:07 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1376.eqiad.wmnet
* 09:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1375.eqiad.wmnet
* 09:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92738 and previous config saved to /var/cache/conftool/dbconfig/20260521-090609-fceratto.json
* 09:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
* 09:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1375.eqiad.wmnet
* 09:01 btullis@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:55 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6011.drmrs.wmnet
* 08:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 08:47 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:44 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp601[1-2].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 08:42 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6004.drmrs.wmnet
* 08:37 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1036: Repooling
* 08:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92733 and previous config saved to /var/cache/conftool/dbconfig/20260521-082951-fceratto.json
* 08:29 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T426633|T426633]])', diff saved to https://phabricator.wikimedia.org/P92731 and previous config saved to /var/cache/conftool/dbconfig/20260521-081642-fceratto.json
* 08:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:02 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1256: Migration of db1256.eqiad.wmnet completed
* 08:01 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6003.drmrs.wmnet
* 08:00 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1256.eqiad.wmnet with OS trixie
* 07:52 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp600[3-4].drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:51 marostegui@dns1004: END - running authdns-update
* 07:50 marostegui@dns1004: START - running authdns-update
* 07:48 marostegui: Failover m3-master [[phab:T426633|T426633]]
* 07:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 07:46 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6010.drmrs.wmnet
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to plain
* 07:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:38 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1256.eqiad.wmnet with reason: host reimage
* 07:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6010.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:35 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp6002.drmrs.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster1005.eqiad.wmnet to drbd
* 07:24 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6002.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 07:24 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1256.eqiad.wmnet with OS trixie
* 07:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1256: Upgrading db1256.eqiad.wmnet
* 07:21 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to plain
* 07:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy1025.eqiad.wmnet with reason: Rebooting
* 07:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of dse-k8s-etcd1001.eqiad.wmnet to drbd
* 06:53 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:52 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to plain
* 06:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:42 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:40 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 06:39 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 06:34 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 06:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 06:24 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of aux-k8s-etcd1003.eqiad.wmnet to drbd
* 06:23 arnaudb@cumin1003: END (FAIL) - Cookbook sre.gerrit.reboot-gerrit (exit_code=99) Rebooting Gerrit on gerrit2003
* 06:22 arnaudb@cumin1003: START - Cookbook sre.gerrit.reboot-gerrit Rebooting Gerrit on gerrit2003
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui: Failover m2-master [[phab:T426633|T426633]]
* 06:13 marostegui@dns1004: START - running authdns-update
* 05:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1012 from dbctl [[phab:T426930|T426930]]', diff saved to https://phabricator.wikimedia.org/P92728 and previous config saved to /var/cache/conftool/dbconfig/20260521-053858-marostegui.json
* 05:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92727 and previous config saved to /var/cache/conftool/dbconfig/20260521-053000-marostegui.json
* 05:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc1022 to pc2 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92726 and previous config saved to /var/cache/conftool/dbconfig/20260521-052905-marostegui.json
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc1012.eqiad.wmnet with reason: Cloning
* 02:41 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on planet1003.eqiad.wmnet with reason: debug wip
* 02:11 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 29s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1027.eqiad.wmnet
* 01:22 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1027.eqiad.wmnet
* 00:55 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
== 2026-05-20 ==
* 23:32 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] (duration: 06m 37s)
* 23:28 ladsgroup@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 23:27 ladsgroup@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290076{{!}}Migrate Swedish to same preference values as other wikis (2/2) (T426880)]]
* 23:19 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] (duration: 08m 35s)
* 23:14 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:10 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290075{{!}}Migrate Swedish to same preference values as other wikis (1/2) (T426880)]]
* 23:07 Amir1: wikiadmin2023@10.64.48.159(svwiki)> delete from user_properties where up_value = '2' and up_property = 'thumbsize'; Query OK, 215 rows affected (0.018 sec) ([[phab:T426880|T426880]] and [[phab:T376152|T376152]])
* 23:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] (duration: 07m 09s)
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Continuing with deployment
* 23:02 ladsgroup@deploy1003: ladsgroup, krinkle: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1290086{{!}}errorpage: Fix unclosed bold tag in 404.php (T129433)]]
* 22:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2027.codfw.wmnet
* 22:36 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2026.codfw.wmnet
* 22:33 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 22:29 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2026.codfw.wmnet
* 22:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2017.codfw.wmnet
* 22:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2017.codfw.wmnet
* 22:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2020.codfw.wmnet
* 21:59 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:59 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6009.drmrs.wmnet
* 21:58 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp6001.drmrs.wmnet
* 21:58 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2020.codfw.wmnet
* 21:58 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2019.codfw.wmnet
* 21:55 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs
* 21:52 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2019.codfw.wmnet
* 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2018.codfw.wmnet
* 21:51 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 21:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 21:48 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6001.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:47 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp6009.drmrs.wmnet<nowiki>}</nowiki> and A:cp
* 21:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2018.codfw.wmnet
* 21:40 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1026.eqiad.wmnet
* 21:37 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 21:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1026.eqiad.wmnet
* 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1025.eqiad.wmnet
* 21:27 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1025.eqiad.wmnet
* 21:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_ulsfo and A:cp
* 21:26 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4052.ulsfo.wmnet
* 21:22 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 21:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1022.eqiad.wmnet
* 21:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
* 21:14 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1022.eqiad.wmnet
* 21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1021.eqiad.wmnet
* 21:08 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 21:07 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1021.eqiad.wmnet
* 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1020.eqiad.wmnet
* 21:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1019.eqiad.wmnet
* 21:00 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] (duration: 10m 45s)
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:54 dancy@deploy1003: codenamenoreste, dancy: Continuing with deployment
* 20:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1019.eqiad.wmnet
* 20:53 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1018.eqiad.wmnet
* 20:53 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
* 20:53 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 20:52 dancy@deploy1003: codenamenoreste, dancy: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1287433{{!}}Restrict the changetags user right to bots and sysops on mediawiki.org (T355445)]]
* 20:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4051.ulsfo.wmnet
* 20:46 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1018.eqiad.wmnet
* 20:46 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1017.eqiad.wmnet
* 20:39 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1016.eqiad.wmnet
* 20:37 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:36 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 20:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
* 20:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:33 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1016.eqiad.wmnet
* 20:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1015.eqiad.wmnet
* 20:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1014.eqiad.wmnet
* 20:21 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 20:20 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1014.eqiad.wmnet
* 20:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1013.eqiad.wmnet
* 20:19 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:18 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:16 dwisehaupt@dns1005: END - running authdns-update
* 20:16 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:15 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs2012.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:15 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Maintenance
* 20:15 dwisehaupt@dns1005: START - running authdns-update
* 20:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1013.eqiad.wmnet
* 20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1012.eqiad.wmnet
* 20:13 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 20:13 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4044.ulsfo.wmnet
* 20:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:12 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 20:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4050.ulsfo.wmnet
* 20:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1012.eqiad.wmnet
* 20:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1011.eqiad.wmnet
* 20:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 20:02 brouberol@cumin1003: START - Cookbook sre.hosts.provision for host kafka-jumbo1016.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 20:01 brouberol@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1011.eqiad.wmnet
* 19:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 19:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 19:39 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 19:39 ejegg@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] (duration: 30m 19s)
* 19:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4043.ulsfo.wmnet
* 19:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4049.ulsfo.wmnet
* 19:27 ejegg@deploy1003: ejegg: Continuing with deployment
* 19:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 19:26 ejegg@deploy1003: ejegg: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 19:09 ejegg@deploy1003: Started scap sync-world: Backport for [[gerrit:1290012{{!}}Restore mistakenly-deleted messages (T111677)]], [[gerrit:1290013{{!}}Restore translations of mistakenly deleted messages (T111677)]]
* 19:03 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 19:02 brouberol@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 18:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4042.ulsfo.wmnet
* 18:49 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] (duration: 07m 28s)
* 18:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4048.ulsfo.wmnet
* 18:45 reedy@deploy1003: reedy: Continuing with deployment
* 18:43 reedy@deploy1003: reedy: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:41 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1290037{{!}}Update symfony/* (T426861)]]
* 18:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 18:38 dwisehaupt@dns1004: END - running authdns-update
* 18:37 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1016.eqiad.wmnet with OS trixie
* 18:36 dwisehaupt@dns1004: START - running authdns-update
* 18:26 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 18:11 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 18:09 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4041.ulsfo.wmnet
* 18:07 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4047.ulsfo.wmnet
* 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2025.codfw.wmnet
* 18:01 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 18:01 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T416562|T416562]]]
* 17:57 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2025.codfw.wmnet
* 17:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2022.codfw.wmnet
* 17:56 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 17:56 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:56 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6
* 17:51 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2022.codfw.wmnet
* 17:51 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2021.codfw.wmnet
* 17:45 swfrench@deploy1003: Finished scap sync-world: Rebuild to pick up new production image (duration: 28m 32s)
* 17:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2021.codfw.wmnet
* 17:45 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2015.codfw.wmnet
* 17:43 hnowlan: disabled puppet on grafana* to temporarily fix file ownership issue on /etc/grafana/provisioning/plugins/mahendrapaipuri-dashboardreporter-app.yaml
* 17:42 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 17:39 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2015.codfw.wmnet
* 17:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2014.codfw.wmnet
* 17:32 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2013.codfw.wmnet
* 17:28 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-codfw
* 17:28 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2003.codfw.wmnet
* 17:28 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2003.codfw.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 17:27 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4040.ulsfo.wmnet
* 17:26 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2013.codfw.wmnet
* 17:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2012.codfw.wmnet
* 17:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2003.codfw.wmnet
* 17:20 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 17:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2012.codfw.wmnet
* 17:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2011.codfw.wmnet
* 17:19 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2003.codfw.wmnet
* 17:17 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2002.codfw.wmnet
* 17:17 swfrench@deploy1003: Started scap sync-world: Rebuild to pick up new production image
* 17:14 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mr1-ulsfo,mr1-ulsfo IPv6,mr1-ulsfo.oob,mr1-ulsfo.oob IPv6 with reason: switch refresh
* 17:13 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2011.codfw.wmnet
* 17:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2010.codfw.wmnet
* 17:10 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2002.codfw.wmnet
* 17:10 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:10 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt
* 17:08 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:06 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2010.codfw.wmnet
* 17:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2008.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2002.codfw.wmnet
* 17:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker2001.codfw.wmnet
* 17:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker2001.codfw.wmnet
* 17:04 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 17:04 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2008.codfw.wmnet
* 17:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2007.codfw.wmnet
* 16:59 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker2001.codfw.wmnet
* 16:58 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:54 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:53 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2007.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker2001.codfw.wmnet
* 16:53 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-codfw
* 16:50 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 16:49 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 16:47 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 16:43 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:39 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:38 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 16:37 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 16:37 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_ulsfo and A:cp
* 16:36 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_ulsfo and not P<nowiki>{</nowiki>cp4037.ulsfo.wmnet<nowiki>}</nowiki> and not P<nowiki>{</nowiki>cp4038.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 16:34 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 16:33 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1325-1327].eqiad.wmnet
* 16:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:dse-k8s-worker-eqiad
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:29 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:20 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 16:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1322-1324].eqiad.wmnet
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:19 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 16:12 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1257: Migration of db1257.eqiad.wmnet completed
* 16:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr4-ulsfo,cr4-ulsfo IPv6,cr4-ulsfo.mgmt with reason: switch refresh
* 16:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1318-1321].eqiad.wmnet
* 16:10 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] (duration: 09m 12s)
* 16:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 16:07 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 16:07 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:07 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt
* 16:05 urbanecm@deploy1003: urbanecm, mszwarc: Continuing with deployment
* 16:05 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 16:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 16:02 urbanecm@deploy1003: urbanecm, mszwarc: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 16:00 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1289980{{!}}Fix newFromUserIdentity calls with interwiki users (T426832)]]
* 16:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1314-1317].eqiad.wmnet
* 15:59 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:59 btullis@cumin1003: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas (exit_code=0) rolling reboot on A:schema
* 15:57 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:56 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:54 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 15:52 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1027.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1023.eqiad.wmnet
* 15:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1024.eqiad.wmnet
* 15:51 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 15:51 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and not P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and not A:magru and (A:dnsbox)
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:50 brett@cumin2002: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 15:50 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:49 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1310-1313].eqiad.wmnet
* 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 15:48 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:46 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp[2041-2042].codfw.wmnet
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:46 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:46 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2041-2042].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1027.eqiad.wmnet
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1023.eqiad.wmnet
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1026.eqiad.wmnet
* 15:45 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 15:45 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs1024.eqiad.wmnet
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 15:44 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
* 15:44 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:44 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1015.eqiad.wmnet with reason: host reimage
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2024.codfw.wmnet
* 15:44 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2023.codfw.wmnet
* 15:43 btullis@cumin1003: START - Cookbook sre.misc-clusters.roll-restart-reboot-eventschemas rolling reboot on A:schema
* 15:41 brett@cumin2002: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 15:41 brett@cumin2002: START - Cookbook sre.dns.roll-reboot rolling reboot on P<nowiki>{</nowiki>dns6002.wikimedia.org<nowiki>}</nowiki> and (A:dnsbox)
* 15:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2016.codfw.wmnet
* 15:39 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:38 brett@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2024.codfw.wmnet
* 15:37 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2023.codfw.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1306-1309].eqiad.wmnet
* 15:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:35 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2016.codfw.wmnet
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2041-2042].codfw.wmnet
* 15:32 atsuko@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 15:30 atsuko@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 15:30 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1015.eqiad.wmnet with OS trixie
* 15:29 moritzm: failover Ganeti master in codfw02 to ganeti2033
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:27 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99)
* 15:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1257: Migration of db1257.eqiad.wmnet completed
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1302-1305].eqiad.wmnet
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:25 bking@cumin2002: START - Cookbook sre.wdqs.reboot
* 15:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1257.eqiad.wmnet with OS trixie
* 15:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 15:20 hashar: Restarted Jenkins CI due to Java upgrade which causes integration/pipelinelib to not be loadable.
* 15:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1298-1301].eqiad.wmnet
* 15:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:09 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=codfw
* 15:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1026.eqiad.wmnet
* 15:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:08 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1025.eqiad.wmnet
* 15:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:04 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1257.eqiad.wmnet with reason: host reimage
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1294-1297].eqiad.wmnet
* 15:00 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1294-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 15:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 14:57 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 14:57 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1014.eqiad.wmnet with reason: host reimage
* 14:56 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:56 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1025.eqiad.wmnet
* 14:54 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1024.eqiad.wmnet
* 14:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 14:51 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:50 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:49 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1257.eqiad.wmnet with OS trixie
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 14:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 14:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1257: Upgrading db1257.eqiad.wmnet
* 14:44 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1257: Upgrading db1257.eqiad.wmnet
* 14:43 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo,cr3-ulsfo IPv6,cr3-ulsfo.mgmt with reason: switch refresh
* 14:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:43 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 moritzm: installing rsync security updates
* 14:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 14:42 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1014.eqiad.wmnet with OS trixie
* 14:42 pt1979@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:42 pt1979@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: router upgrade, [[phab:T416562|T416562]]]
* 14:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
* 14:40 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 14:39 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 14:38 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1024.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:36 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1019.eqiad.wmnet
* 14:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 14:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 14:28 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 14:27 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test2001.codfw.wmnet with reason: host reimage
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 14:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:22 dancy@deploy1003: Installation of scap version "4.266.0" completed for 2 hosts
* 14:22 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: [[phab:T426560|T426560]] - bking@cumin2002
* 14:21 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 dancy@deploy1003: Installing scap version "4.266.0" for 2 host(s)
* 14:19 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 14:17 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1010.eqiad.wmnet
* 14:16 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS trixie
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1019.eqiad.wmnet
* 14:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1018.eqiad.wmnet
* 14:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 14:12 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:11 bjensen: uploaded trixie-packaged memkeys on apt1002
* 14:10 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:10 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 14:09 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:08 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:08 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:08 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 14:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Migration of db1258.eqiad.wmnet completed
* 14:04 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1013.eqiad.wmnet with reason: host reimage
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1018.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1017.eqiad.wmnet
* 14:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1010.eqiad.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 14:00 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5].eqiad.wmnet<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1017.eqiad.wmnet
* 13:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:56 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1016.eqiad.wmnet
* 13:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:53 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 13:53 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:52 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 13:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1011.eqiad.wmnet
* 13:49 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 13:49 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1013.eqiad.wmnet with OS trixie
* 13:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1016.eqiad.wmnet
* 13:48 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1015.eqiad.wmnet
* 13:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 13:47 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:43 Lucas_WMDE: UTC afternoon backport+config window done
* 13:43 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1011.eqiad.wmnet
* 13:43 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] (duration: 06m 39s)
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 13:41 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1015.eqiad.wmnet
* 13:41 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:41 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1014.eqiad.wmnet
* 13:38 reedy@deploy1003: reedy: Continuing with deployment
* 13:38 moritzm: installing krb5 security updates
* 13:38 reedy@deploy1003: reedy: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:37 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1056
* 13:36 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1289953{{!}}composer.json: Upgrading symfony/yaml (v7.4.6 => v7.4.12) (T426845)]]
* 13:35 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1056
* 13:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2183
* 13:31 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 13:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:28 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
* 13:27 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2183.codfw.wmnet 6.0.192.10.in-addr.arpa 6.0.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:27 root@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 root@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host db2183 - root@cumin1003"
* 13:26 arthurtaylor@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] (duration: 07m 26s)
* 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:25 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: [[phab:T426560|T426560]] - bking@cumin2002
* 13:24 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:23 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:22 root@cumin1003: START - Cookbook sre.dns.netbox
* 13:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1014.eqiad.wmnet
* 13:22 arthurtaylor@deploy1003: arthurtaylor: Continuing with deployment
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:22 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1013.eqiad.wmnet
* 13:20 arthurtaylor@deploy1003: arthurtaylor: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:19 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:19 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Migration of db1258.eqiad.wmnet completed
* 13:19 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1012.eqiad.wmnet with reason: host reimage
* 13:18 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2183
* 13:18 arthurtaylor@deploy1003: Started scap sync-world: Backport for [[gerrit:1289736{{!}}Disable support for PHP-serialized EntityData on Beta / Test Wikidata (T98035)]]
* 13:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 13:18 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 13:18 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS trixie
* 13:18 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:18 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.update-replication (exit_code=97)
* 13:17 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster eqiad and group A
* 13:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1258.eqiad.wmnet with OS trixie
* 13:17 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:15 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] (duration: 07m 55s)
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:15 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1013.eqiad.wmnet
* 13:14 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:14 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1012.eqiad.wmnet
* 13:13 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 13:12 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 13:11 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:09 sbisson@deploy1003: sbisson: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 13:07 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1289446{{!}}Log editing_start and article_saved events for control group (T422146)]]
* 13:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 13:04 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1012.eqiad.wmnet with OS trixie
* 13:03 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 13:02 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-8].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 13:01 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] (duration: 09m 31s)
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1012.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:01 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1011.eqiad.wmnet
* 13:00 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:59 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:58 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:58 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 12:58 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 12:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 12:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 12:57 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 12:56 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1289,1291-1293].eqiad.wmnet
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:55 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:54 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:53 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1258.eqiad.wmnet with reason: host reimage
* 12:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 12:52 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289902{{!}}hCaptcha: Exempt Wikibase entity namespaces from edit/create triggers (T426829)]]
* 12:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1011.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1010.eqiad.wmnet
* 12:48 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[2183-2184].codfw.wmnet with reason: restart
* 12:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1285-1288].eqiad.wmnet
* 12:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:44 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 12:43 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1010.eqiad.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1009.eqiad.wmnet
* 12:42 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:37 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 12:36 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db1258.eqiad.wmnet with OS trixie
* 12:36 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:35 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1281-1284].eqiad.wmnet
* 12:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:33 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:33 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 12:31 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 12:31 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 12:26 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 12:26 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] (duration: 08m 37s)
* 12:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 12:25 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:22 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:22 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1277-1280].eqiad.wmnet
* 12:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:19 kharlan@deploy1003: kharlan: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:17 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1289938{{!}}Revert "ApiEditPage: Update request in main context before calling attemptSave()" (T426751)]]
* 12:17 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1011.eqiad.wmnet with reason: host reimage
* 12:16 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 12:15 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1273-1276].eqiad.wmnet
* 12:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:09 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2162: Migration of db2162.codfw.wmnet completed
* 12:06 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1009.eqiad.wmnet
* 12:05 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1008.eqiad.wmnet
* 12:04 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 12:04 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 12:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:02 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1011.eqiad.wmnet with OS trixie
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1269-1272].eqiad.wmnet
* 12:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 12:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:59 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:59 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 11:58 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] (duration: 07m 04s)
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 11:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:55 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:55 cwilliams@cumin1003: END (FAIL) - Cookbook sre.mysql.major-upgrade (exit_code=99)
* 11:54 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:53 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 11:51 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:51 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1289928{{!}}Fix UserGroupManager::getUserAutopromoteGroups with interwiki users (T426832)]]
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1265-1268].eqiad.wmnet
* 11:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:50 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1008.eqiad.wmnet
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1007.eqiad.wmnet
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 11:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db1258: Upgrading db1258.eqiad.wmnet
* 11:43 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 11:42 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 11:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 11:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1016.eqiad.wmnet with OS trixie
* 11:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 11:39 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1261-1264].eqiad.wmnet
* 11:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 11:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 11:31 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:29 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1257-1260].eqiad.wmnet
* 11:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:25 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:24 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2162: Migration of db2162.codfw.wmnet completed
* 11:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:22 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2162.codfw.wmnet with OS trixie
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 11:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:20 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1055.eqiad.wmnet to cluster codfw and group A
* 11:19 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:19 jmm@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1253-1256].eqiad.wmnet
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:18 jmm@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
* 11:17 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 11:17 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1016.eqiad.wmnet with reason: host reimage
* 11:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 11:16 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1010.eqiad.wmnet with reason: host reimage
* 11:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1007.eqiad.wmnet
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:12 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1006.eqiad.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 11:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:06 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[7-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:06 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1249-1252].eqiad.wmnet
* 11:05 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 11:05 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7014.magru.wmnet
* 11:05 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1249-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2011.codfw.wmnet
* 11:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2011.codfw.wmnet
* 11:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
* 11:05 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 11:04 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 11:04 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 11:01 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:01 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-jumbo1010.eqiad.wmnet with OS trixie
* 11:01 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2011.codfw.wmnet
* 11:00 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2162.codfw.wmnet with reason: host reimage
* 10:59 moritzm: failover Ganeti cluster in codfw to ganeti2048
* 10:57 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 10:56 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2011.codfw.wmnet
* 10:56 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2010.codfw.wmnet
* 10:55 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2010.codfw.wmnet
* 10:53 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1006.eqiad.wmnet
* 10:51 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1005.eqiad.wmnet
* 10:51 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:51 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2010.codfw.wmnet
* 10:46 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2009.codfw.wmnet
* 10:46 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2009.codfw.wmnet
* 10:44 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:42 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2162.codfw.wmnet with OS trixie
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 10:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2009.codfw.wmnet
* 10:40 fnegri@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb1015.eqiad.wmnet with reason: host reimage
* 10:40 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2162: Upgrading db2162.codfw.wmnet
* 10:39 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2009.codfw.wmnet
* 10:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2008.codfw.wmnet
* 10:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2008.codfw.wmnet
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 10:32 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2242: Migration of db2242.codfw.wmnet completed
* 10:31 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2008.codfw.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1005.eqiad.wmnet
* 10:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:30 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1004.eqiad.wmnet
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 10:27 fnegri@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1015.eqiad.wmnet with OS trixie
* 10:24 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 slyngshede@dns1004: END - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1004.eqiad.wmnet
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:23 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1003.eqiad.wmnet
* 10:22 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 10:21 slyngshede@dns1004: START - running authdns-update
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2008.codfw.wmnet
* 10:20 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2007.codfw.wmnet
* 10:20 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2007.codfw.wmnet
* 10:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 10:14 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2007.codfw.wmnet
* 10:13 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-wdqs-test1001.eqiad.wmnet with reason: host reimage
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 10:12 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1003.eqiad.wmnet
* 10:11 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:11 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1002.eqiad.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2007.codfw.wmnet
* 10:09 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2006.codfw.wmnet
* 10:09 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2006.codfw.wmnet
* 10:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 10:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dborch1002.wikimedia.org with reason: Reboot
* 10:04 btullis@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1002.eqiad.wmnet
* 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:03 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host dse-k8s-worker1001.eqiad.wmnet
* 10:02 fnegri@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1015.eqiad.wmnet with OS trixie
* 10:02 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2006.codfw.wmnet
* 09:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T415165|T415165]]
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2006.codfw.wmnet
* 09:57 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2005.codfw.wmnet
* 09:57 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2005.codfw.wmnet
* 09:56 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 09:56 btullis@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:52 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2005.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 09:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2005.codfw.wmnet
* 09:47 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2004.codfw.wmnet
* 09:47 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2004.codfw.wmnet
* 09:46 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2242: Migration of db2242.codfw.wmnet completed
* 09:44 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:43 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2242.codfw.wmnet with OS trixie
* 09:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2004.codfw.wmnet
* 09:41 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 09:37 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:37 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2004.codfw.wmnet
* 09:36 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2003.codfw.wmnet
* 09:36 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2003.codfw.wmnet
* 09:33 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:33 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 09:32 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2003.codfw.wmnet
* 09:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2031.codfw.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2031.codfw.wmnet
* 09:30 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 09:30 klausman@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 09:28 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 09:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 09:28 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2003.codfw.wmnet
* 09:27 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2002.codfw.wmnet
* 09:27 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2002.codfw.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1001.eqiad.wmnet
* 09:26 btullis@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:dse-k8s-worker-eqiad
* 09:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 09:24 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:23 klausman@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 09:19 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2242.codfw.wmnet with reason: host reimage
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 09:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 09:18 moritzm: temporarily drop ganeti2030 from the codfw cluster [[phab:T426199|T426199]]
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 09:15 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2002.codfw.wmnet
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 09:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: Migration of db2187.codfw.wmnet completed
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 09:09 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5004.wikimedia.org
* 09:07 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 12389
* 09:06 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 12389
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 207947
* 09:05 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2002.codfw.wmnet
* 09:04 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 09:04 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve2001.codfw.wmnet
* 09:04 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve2001.codfw.wmnet
* 09:04 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 207947
* 09:04 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2242.codfw.wmnet with OS trixie
* 09:03 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2242: Upgrading db2242.codfw.wmnet
* 09:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2242: Upgrading db2242.codfw.wmnet
* 09:01 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 09:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5004.wikimedia.org
* 08:59 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve2001.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4004.wikimedia.org
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:54 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve2001.codfw.wmnet
* 08:53 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 08:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4004.wikimedia.org
* 08:46 slyngshede@cumin1003: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 08:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 08:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 08:35 slyngshede@cumin1003: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 08:27 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2187: Migration of db2187.codfw.wmnet completed
* 08:26 moritzm: installing Java 11 security updates
* 08:26 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:25 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS trixie
* 08:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 08:24 moritzm: imported openjdk-8u492-ga-1~deb11u1 to component/jdk8 for bookworm (forward port of latest Java 8 security release)
* 08:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2006-2008].codfw.wmnet with reason: Reboot
* 08:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy[2005-2008].codfw.wmnet with reason: Reboot
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 08:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1029.eqiad.wmnet with reason: Reboot
* 08:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1026.eqiad.wmnet with reason: Reboot
* 08:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 08:07 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:02 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
* 08:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1023.eqiad.wmnet with reason: Reboot
* 07:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 07:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: Reboot
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 07:52 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 07:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 07:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1024.eqiad.wmnet with reason: Reboot
* 07:46 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 07:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 07:43 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS trixie
* 07:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1058.eqiad.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 07:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 07:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 07:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
* 07:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1058.eqiad.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1057.eqiad.wmnet
* 07:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 07:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 07:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1249: Repooling after boot
* 07:27 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Upgrading db2187.codfw.wmnet
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1057.eqiad.wmnet
* 07:27 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 07:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 07:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 07:24 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] (duration: 06m 28s)
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1056.eqiad.wmnet
* 07:19 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:19 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 07:17 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289748{{!}}Fix wordmark dimensions]], [[gerrit:1289749{{!}}Fix wordmark dimensions]]
* 07:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1056.eqiad.wmnet
* 07:15 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] (duration: 10m 51s)
* 07:11 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1055.eqiad.wmnet
* 07:09 moritzm: remove haveged
* 07:06 mlitn@deploy1003: mlitn: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1055.eqiad.wmnet
* 07:04 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1289743{{!}}Squashed diff to master]], [[gerrit:1289744{{!}}Squashed diff to master]]
* 06:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 06:46 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db1249.eqiad.wmnet
* 06:46 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for db1249.eqiad.wmnet
* 06:45 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1249: Repooling after boot
* 06:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 06:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1011.eqiad.wmnet
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:35 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:34 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 moritzm: failover Ganeti cluster in drmrs02 to ganeti6002
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 06:31 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc1011.eqiad.wmnet
* 06:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 06:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet with reason: restart
* 05:41 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc1011 from dbctl [[phab:T426806|T426806]]', diff saved to https://phabricator.wikimedia.org/P92647 and previous config saved to /var/cache/conftool/dbconfig/20260520-054146-marostegui.json
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1012.eqiad.wmnet: Maintenance on pc2
* 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2022.codfw.wmnet,pc[1012,1022].eqiad.wmnet with reason: Maintenance on pc1
* 02:18 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:18 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:17 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 01:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 01:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp7011.magru.wmnet
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1015.eqiad.wmnet with OS trixie
* 01:30 pt1979@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 01:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir7003.*
* 01:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp7011.magru.wmnet
* 01:23 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp7011.*
* 01:10 pt1979@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin1003"
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] (duration: 08m 10s)
* 00:54 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1015.eqiad.wmnet with reason: host reimage
* 00:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1288274{{!}}404.php: Force a redirect to /wiki/ in very obvious cases (T129433)]]
* 00:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] (duration: 06m 33s)
* 00:39 ladsgroup@deploy1003: jdlrobson, ladsgroup: Continuing with deployment
* 00:38 ladsgroup@deploy1003: jdlrobson, ladsgroup: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:36 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1287441{{!}}Limit $wgThumbLimits to three options (T426328)]]
* 00:05 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
== 2026-05-19 ==
* 23:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2+icu72u1 into component/php83-icu72
* 22:39 brett: disabling pybal/puppet on lvs2012 due to hardware misconfiguration/failure - [[phab:T425890|T425890]]
* 22:18 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on lvs2012.codfw.wmnet with reason: MD RAID failure
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2012.codfw.wmnet
* 22:16 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2012.codfw.wmnet
* 21:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:46 jiji@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 21:42 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:41 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:34 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:27 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:23 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2014.codfw.wmnet
* 21:19 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1037
* 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:18 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1037
* 21:18 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:16 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host wdqs1036
* 21:16 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1036
* 21:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 21:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1038.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:10 cdanis: 💔cdanis@apt1002.wikimedia.org ~ 🕔🍺 sudo -i reprepro -C main --ignore=wrongdistribution copy bookworm-wikimedia trixie-wikimedia cidergrinder
* 21:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1037.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:09 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1036.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:08 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wdqs1036 to eqiad - jclark@cumin1003"
* 21:04 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 20:55 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] (duration: 06m 40s)
* 20:51 sbassett@deploy1003: sbassett: Continuing with deployment
* 20:51 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:51 sbassett@deploy1003: sbassett: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:49 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1288999{{!}}Explicitly set wgCSPUseReportURIDirective and not wmgCSPUseReportURIDirective to true (T424058)]]
* 20:47 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 20:40 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] (duration: 08m 12s)
* 20:36 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 20:35 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:35 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 20:34 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1288983{{!}}Revert^2 "Include xff in search logs"]]
* 20:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 20:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1245-1248].eqiad.wmnet
* 20:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:20 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:16 tgr@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] (duration: 08m 25s)
* 20:13 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:12 tgr@deploy1003: cscott, tgr: Continuing with deployment
* 20:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1241-1244].eqiad.wmnet
* 20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 20:10 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:10 tgr@deploy1003: cscott, tgr: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:08 tgr@deploy1003: Started scap sync-world: Backport for [[gerrit:1288953{{!}}Add CommonsFinder to $wgUrlProtocols (T426614)]], [[gerrit:1289352{{!}}Remove unused ParsoidFragmentInput and ParsoidFragmentSupport]]
* 20:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 20:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 20:03 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:02 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1163-1165,1240].eqiad.wmnet
* 20:01 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 20:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:57 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] (duration: 16m 15s)
* 19:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 19:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 19:53 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 19:53 otto@deploy1003: otto: Continuing with deployment
* 19:51 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1159-1162].eqiad.wmnet
* 19:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 19:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 19:44 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:43 otto@deploy1003: otto: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1155-1158].eqiad.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[3-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:42 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:41 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d] (duration: 02m 07s)
* 19:41 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp701[1-2].magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:41 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1289408{{!}}BugFix: Emit page_change at version 1.6.0 to pick up user wiki_id (T426198)]]
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:39 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (thin): Redeploy v0.3.14 THIN [analytics/refinery@eeef7f3d]
* 19:39 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d] (duration: 04m 35s)
* 19:34 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3]: Redeploy v0.3.14 [analytics/refinery@eeef7f3d]
* 19:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 19:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1151-1154].eqiad.wmnet
* 19:30 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:30 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:28 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d] (duration: 02m 00s)
* 19:26 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): ReHotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 19:25 mutante: rebooting gitlab-replica-a.wikimedia.org
* 19:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 19:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 19:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1147-1150].eqiad.wmnet
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7002.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7010.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1143-1146].eqiad.wmnet
* 19:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 19:06 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 19:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:03 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and not A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1139-1142].eqiad.wmnet
* 19:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 19:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:55 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 18:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1135-1138].eqiad.wmnet
* 18:50 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:50 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 18:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1131-1134].eqiad.wmnet
* 18:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:40 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1127-1130].eqiad.wmnet
* 18:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:24 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 18:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1123-1126].eqiad.wmnet
* 18:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:12 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7001.magru.wmnet,cp7009.magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:08 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:07 cjd91: import gdnsd_3.99.0-alpha3~deb13u1 into trixie-wikimedia - [[phab:T401832|T401832]]
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1119-1122].eqiad.wmnet
* 18:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:05 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1115-1118].eqiad.wmnet
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 17:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1115-1118].eqiad.wmnet
* 17:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:51 joal@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Hotfix Hadoop-test [analytics/refinery@eeef7f3d]
* 17:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1094-1095,1113-1114].eqiad.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:35 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 14 hosts with reason: restart
* 17:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1085-1087,1093].eqiad.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:33 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:30 tchin@deploy1003: Finished deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d] (duration: 00m 46s)
* 17:30 tchin@deploy1003: Started deploy [analytics/refinery@eeef7f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eeef7f3d]
* 17:28 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 17:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 17:27 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:26 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1079-1081,1084].eqiad.wmnet
* 17:23 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:22 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 17:14 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:13 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1075-1078].eqiad.wmnet
* 17:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:12 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:03 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 17:03 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1071-1074].eqiad.wmnet
* 17:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1067-1070].eqiad.wmnet
* 17:02 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[2004-2007].codfw.wmnet with reason: restart
* 16:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:54 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backupmon1001.eqiad.wmnet with reason: restart
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1067-1070].eqiad.wmnet
* 16:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:46 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1057,1064-1066].eqiad.wmnet
* 16:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1053-1056].eqiad.wmnet
* 16:32 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:32 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:28 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbprov[1004-1007].eqiad.wmnet with reason: restart
* 16:24 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1049-1052].eqiad.wmnet
* 16:21 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:21 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1045-1048].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1041-1044].eqiad.wmnet
* 16:07 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-wdqs-test1001.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 16:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1041-1044].eqiad.wmnet
* 15:59 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:59 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 15:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 15:53 moritzm: temporarily drop ganeti2029 from the codfw cluster [[phab:T426199|T426199]]
* 15:52 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 15:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 15:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 15:49 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2243: Migration of db2243.codfw.wmnet completed
* 15:49 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir and A:ncredir
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1037-1040].eqiad.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:45 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 15:43 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 15:39 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 15:39 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1021,1034-1036].eqiad.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:36 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:35 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 15:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 15:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 15:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 15:30 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 15:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 15:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 15:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 15:28 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1006-1007,1015-1016].eqiad.wmnet
* 15:27 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1006-1007,1015-1016,1021,1034-1057,1064-1081,1084-1087,1093-1095,1113-1165,1240-1289,1291-1327,1375-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:27 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 15:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 15:26 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 15:25 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1070.eqiad.wmnet
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 15:23 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 15:23 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:22 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2002.codfw.wmnet
* 15:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:22 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 15:21 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 15:21 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:21 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:20 swfrench-wmf: reprepro include php8.3_8.3.31-1+wmf11u2 into component/php83
* 15:20 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2002.codfw.wmnet
* 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 15:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 15:19 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2003.codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1055.eqiad.wmnet
* 15:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1055.eqiad.wmnet
* 15:17 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2003.codfw.wmnet
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 15:16 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]] (duration: 00m 46s)
* 15:16 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd2001.codfw.wmnet
* 15:16 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
* 15:16 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab1004 for [[phab:T426754|T426754]]
* 15:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 15:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 15:14 brennen@deploy1003: Finished deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]] (duration: 00m 49s)
* 15:14 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: restart
* 15:14 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
* 15:14 brennen@deploy1003: Started deploy [phabricator/deployment@463a948]: deploy phab2002 for [[phab:T426754|T426754]]
* 15:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 15:13 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 15:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 15:12 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd2001.codfw.wmnet
* 15:11 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
* 15:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:08 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 15:08 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 15:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 15:06 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 15:06 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 15:06 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 15:06 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] (duration: 08m 03s)
* 15:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1003.eqiad.wmnet
* 15:04 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl1002.eqiad.wmnet
* 15:04 cwilliams@cumin1003: START - Cookbook sre.mysql.pool pool db2243: Migration of db2243.codfw.wmnet completed
* 15:02 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 15:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2243.codfw.wmnet with OS trixie
* 15:01 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 15:00 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 15:00 moritzm: failover Ganeti cluster in drmrs01 to ganeti6001
* 15:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 14:59 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl1002.eqiad.wmnet
* 14:59 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
* 14:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 14:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 14:57 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289368{{!}}Enable hCaptcha for wikitext editor on group1 minus meta and itwiki (T425354)]]
* 14:57 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:56 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
* 14:55 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:55 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 14:51 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 14:51 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 09m 37s)
* 14:51 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2003.codfw.wmnet
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:47 dreamyjazz@deploy1003: kharlan, dreamyjazz: Continuing with deployment
* 14:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 14:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-ctrl2002.codfw.wmnet
* 14:45 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
* 14:44 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 14:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
* 14:44 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
* 14:44 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 14:44 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:43 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 14:43 dreamyjazz@deploy1003: kharlan, dreamyjazz: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS trixie
* 14:42 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:41 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:41 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:40 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] (duration: 07m 16s)
* 14:40 dreamyjazz@deploy1003: dreamyjazz, kharlan: Rolling back deployment
* 14:40 sukhe@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 14:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 14:39 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 14:39 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
* 14:38 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
* 14:38 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 14:37 dreamyjazz@deploy1003: dreamyjazz, kharlan: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 14:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 14:36 cwilliams@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2243.codfw.wmnet with reason: host reimage
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc1 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92606 and previous config saved to /var/cache/conftool/dbconfig/20260519-143632-marostegui.json
* 14:35 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc1021 master of pc1 in eqiad [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92605 and previous config saved to /var/cache/conftool/dbconfig/20260519-143549-marostegui.json
* 14:35 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:34 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 14:33 kamila@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host deploy2002.codfw.wmnet
* 14:33 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289348{{!}}hCaptcha: Enable for group1 wikis (except itwiki, metawiki) (T425354)]], [[gerrit:1289340{{!}}Drop unused $wgIPInfoIpoidUrl definition]], [[gerrit:1289339{{!}}Drop unused $wgWikimediaEventsIPoidUrl definition]], [[gerrit:1289337{{!}}Drop wgCheckUserDisplayClientHints definition]]
* 14:33 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 14:33 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
* 14:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
* 14:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 14:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 14:31 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:30 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:29 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 14:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 14:27 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
* 14:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
* 14:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
* 14:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 14:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 14:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 14:21 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
* 14:20 cwilliams@cumin1003: START - Cookbook sre.hosts.reimage for host db2243.codfw.wmnet with OS trixie
* 14:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1204.eqiad.wmnet with OS trixie
* 14:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 14:17 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2243: Upgrading db2243.codfw.wmnet
* 14:16 cwilliams@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
* 14:16 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2005.codfw.wmnet
* 14:16 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2005.codfw.wmnet
* 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2005.codfw.wmnet
* 14:11 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2004.codfw.wmnet
* 14:11 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2004.codfw.wmnet
* 14:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 14:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 14:07 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 14:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2004.codfw.wmnet
* 14:07 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 14:07 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2004.codfw.wmnet
* 14:06 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2003.codfw.wmnet
* 14:06 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2003.codfw.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 14:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 14:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2003.codfw.wmnet
* 14:02 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2002.codfw.wmnet
* 14:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2002.codfw.wmnet
* 14:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:59 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db[1204-1205].eqiad.wmnet with reason: restart/reimage
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 13:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 13:57 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2002.codfw.wmnet
* 13:57 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
* 13:55 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 13:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2029.codfw.wmnet
* 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
* 13:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 13:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 13:46 root@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 13:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 13:42 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 13:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 13:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 13:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 13:37 Lucas_WMDE: UTC afternoon backport+config window done
* 13:36 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1204.eqiad.wmnet with reason: host reimage
* 13:36 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] (duration: 12m 56s)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki1001.eqiad.wmnet
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:36 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:35 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - elukey@cumin1003"
* 13:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 13:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 13:32 cscott@deploy1003: cscott: Continuing with deployment
* 13:31 elukey@cumin1003: START - Cookbook sre.dns.netbox
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 13:29 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 13:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 13:28 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 13:28 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox and A:magru and (A:dnsbox)
* 13:28 sukhe@cumin1003: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on A:hcaptcha-proxy and A:hcaptcha-proxy
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:27 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 13:25 elukey@cumin1003: START - Cookbook sre.hosts.decommission for hosts pki1001.eqiad.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:25 cscott@deploy1003: cscott: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 13:23 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1289070{{!}}Forward-compatibility for serialization of ContentHolder in ParserOutput (T423701)]], [[gerrit:1289071{{!}}ParsoidLanguageConverter: don't convert TOC if __NOCONTENTCONVERT__ (T424773)]]
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 13:22 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 13:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 13:21 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1204.eqiad.wmnet with OS trixie
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 13:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 13:16 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] (duration: 07m 00s)
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 13:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 13:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 13:14 cezmunsta: Removing db2143 from orchestrator [[phab:T424171|T424171]]
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2143.codfw.wmnet
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2143.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 13:13 cezmunsta: Removing db2143 from zarcillo [[phab:T424171|T424171]]
* 13:13 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS trixie
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:12 dbrant@deploy1003: dbrant: Continuing with deployment
* 13:11 moritzm: failover Ganeti cluster in esams to ganeti3005
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 13:11 dbrant@deploy1003: dbrant: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 13:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS trixie
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 13:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 13:09 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1289328{{!}}Add "get_login_creds" permission to Android app for auth domain. (T426010)]]
* 13:08 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:52 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS trixie
* 12:50 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2027.codfw.wmnet
* 12:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 12:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 12:47 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2149.codfw.wmnet
* 12:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2027.codfw.wmnet
* 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 12:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 12:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 12:36 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:36 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2008.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS trixie
* 12:35 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2001.codfw.wmnet
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS trixie
* 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 12:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 12:33 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 12:33 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3005.esams.wmnet
* 12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 12:32 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS trixie
* 12:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS trixie
* 12:28 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2008.codfw.wmnet
* 12:28 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet
* 12:27 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2001.codfw.wmnet
* 12:27 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 12:26 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 12:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 12:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS trixie
* 12:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 12:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS trixie
* 12:20 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1279.eqiad.wmnet with OS trixie
* 12:19 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:19 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 12:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 12:18 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
* 12:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus7002.magru.wmnet
* 12:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS trixie
* 12:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 12:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 12:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 12:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 12:12 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 12:12 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus7002.magru.wmnet
* 12:12 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus6002.drmrs.wmnet
* 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 12:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS trixie
* 12:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 12:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 12:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 12:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2005.codfw.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS trixie
* 12:06 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus6002.drmrs.wmnet
* 12:06 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2007.codfw.wmnet
* 12:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS trixie
* 12:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 12:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2005.codfw.wmnet
* 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2006.codfw.wmnet
* 12:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 11:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:57 taavi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:56 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 11:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be2006.codfw.wmnet
* 11:56 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2007.codfw.wmnet
* 11:56 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
* 11:53 taavi@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 11:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 18 hosts with reason: restart
* 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 11:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 11:47 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 11:46 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
* 11:46 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus5003.eqsin.wmnet
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 11:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 11:42 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 11:39 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus5003.eqsin.wmnet
* 11:39 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1008.eqiad.wmnet
* 11:39 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 11:37 moritzm: failover Ganeti cluster in eqsin to ganeti5004
* 11:37 moritzm: failover Ganeti cluster in magru to ganeti7001
* 11:36 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS trixie
* 11:35 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS trixie
* 11:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS trixie
* 11:34 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS trixie
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS trixie
* 11:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS trixie
* 11:31 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS trixie
* 11:29 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1008.eqiad.wmnet
* 11:29 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 11:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS trixie
* 11:21 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS trixie
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 11:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:19 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
* 11:19 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus4003.ulsfo.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS trixie
* 11:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS trixie
* 11:13 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus4003.ulsfo.wmnet
* 11:13 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3004.esams.wmnet
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 11:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 11:10 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS trixie
* 11:07 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus3004.esams.wmnet
* 11:07 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1007.eqiad.wmnet
* 11:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 11:06 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 11:05 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS trixie
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 11:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS trixie
* 11:00 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1002.eqiad.wmnet
* 10:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:59 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1007.eqiad.wmnet
* 10:59 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
* 10:57 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS trixie
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS trixie
* 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 10:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:53 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1002.eqiad.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS trixie
* 10:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:50 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:50 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1005.eqiad.wmnet
* 10:49 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
* 10:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS trixie
* 10:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:46 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1005.eqiad.wmnet
* 10:45 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1004.eqiad.wmnet
* 10:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 10:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
* 10:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 10:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1004.eqiad.wmnet
* 10:40 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:40 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1003.eqiad.wmnet
* 10:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 10:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1006.eqiad.wmnet
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1003.eqiad.wmnet
* 10:36 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:36 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1002.eqiad.wmnet
* 10:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 10:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 10:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 10:32 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1006.eqiad.wmnet
* 10:31 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 10:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1002.eqiad.wmnet
* 10:26 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on D<nowiki>{</nowiki>aux-k8s-worker100[2-5].eqiad.wmnet<nowiki>}</nowiki> and (A:aux-master-eqiad or A:aux-worker-eqiad)
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS trixie
* 10:24 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS trixie
* 10:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS trixie
* 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS trixie
* 10:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS trixie
* 10:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS trixie
* 10:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS trixie
* 10:18 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan2002.codfw.wmnet
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS trixie
* 10:10 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan2002.codfw.wmnet
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 10:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2244: Migration of db2244.codfw.wmnet completed
* 10:07 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 10:05 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host titan1001.eqiad.wmnet
* 10:00 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 10:00 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 09:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 09:58 tappof@cumin1003: START - Cookbook sre.hosts.reboot-single for host titan1001.eqiad.wmnet
* 09:55 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 09:53 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 09:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 09:47 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 09:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 09:42 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 09:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 40401
* 09:38 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 40401
* 09:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps-test2001.codfw.wmnet
* 09:34 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 09:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1006.eqiad.wmnet
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
* 09:31 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:31 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps-test2001.codfw.wmnet
* 09:26 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
* 09:25 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:25 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 09:23 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2244: Migration of db2244.codfw.wmnet completed
* 09:20 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
* 09:20 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2244.codfw.wmnet with OS trixie
* 09:20 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
* 09:19 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:19 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
* 09:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:18 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
* 09:13 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 09:13 filippo@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host cloudnet1006.eqiad.wmnet
* 09:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet1006.eqiad.wmnet
* 09:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:07 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
* 09:05 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 09:05 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 09:04 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 09:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 09:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2244.codfw.wmnet with reason: host reimage
* 08:59 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 08:58 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 08:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 08:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 08:55 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] (duration: 07m 15s)
* 08:52 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 08:51 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 08:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 08:50 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:49 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 08:49 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 08:49 elukey@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
* 08:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1072.eqiad.wmnet
* 08:48 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1289275{{!}}Remove unused $wgEnableUserEmailMuteList config (T413867)]]
* 08:47 moritzm: failover Ganeti cluster in ulsfo to ganeti4008
* 08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 08:44 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] (duration: 09m 08s)
* 08:44 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 08:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2244.codfw.wmnet with OS trixie
* 08:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:43 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1072.eqiad.wmnet
* 08:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1071.eqiad.wmnet
* 08:42 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 08:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 08:41 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 08:40 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2244: Upgrading db2244.codfw.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 08:39 marostegui@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 08:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1071.eqiad.wmnet
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1070.eqiad.wmnet
* 08:37 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 08:37 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 08:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:35 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 08:35 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286804{{!}}IPReputation: Route opensearch_ipoid through envoy service mesh (T421293)]]
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:35 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 08:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 08:33 cezmunsta: Removing db2151 from orchestrator [[phab:T424343|T424343]]
* 08:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 08:32 cezmunsta: Removing db2151 from zarcillo [[phab:T424343|T424343]]
* 08:32 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc1070.eqiad.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2151.codfw.wmnet
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:31 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2151.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 08:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 08:28 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 08:27 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 08:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:26 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 08:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 08:24 Emperor: reboot apus codfw frontends (May reboots)
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:22 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2151.codfw.wmnet
* 08:19 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 08:19 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 08:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 08:17 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 08:17 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:13 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4008.ulsfo.wmnet
* 08:10 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2212: Repooling after switchover
* 07:58 cezmunsta: Removing db2150 from orchestrator [[phab:T424342|T424342]]
* 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 07:57 Emperor: reboot apus eqiad frontends (May reboots)
* 07:52 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 07:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 07:50 cezmunsta: Removing db2150 from zarcillo [[phab:T424342|T424342]]
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2150.codfw.wmnet
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:48 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:48 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2150.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 07:46 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 07:45 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 07:43 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 07:39 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 07:39 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2150.codfw.wmnet
* 07:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
* 07:33 XioNoX: add gnmic 0.46.0 to reprepro
* 07:20 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] (duration: 13m 17s)
* 07:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:17 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2212: Repooling after switchover
* 07:14 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 07:14 mlitn@deploy1003: mlitn: Continuing with deployment
* 07:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1210: Repooling after switchover
* 07:13 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288994{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 07:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:07 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288994{{!}}Squashed diff to master]]
* 07:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 07:04 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 07:02 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 06:59 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
* 06:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 06:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 06:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db2212 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92584 and previous config saved to /var/cache/conftool/dbconfig/20260519-065637-fceratto.json
* 06:54 moritzm: installing qemu security updates
* 06:52 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2203 to s1 primary [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92583 and previous config saved to /var/cache/conftool/dbconfig/20260519-065224-fceratto.json
* 06:52 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2021.codfw.wmnet,pc[1011,1021].eqiad.wmnet with reason: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:51 federico3: Starting s1 codfw failover from db2212 to db2203 - [[phab:T426703|T426703]]
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011.eqiad.wmnet: Maintenance on pc1
* 06:50 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 06:45 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2203 with weight 0 [[phab:T426703|T426703]]', diff saved to https://phabricator.wikimedia.org/P92581 and previous config saved to /var/cache/conftool/dbconfig/20260519-064500-fceratto.json
* 06:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 30 hosts with reason: Primary switchover s1 [[phab:T426703|T426703]]
* 06:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 06:40 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 06:39 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2014.codfw.wmnet
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:33 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2014.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:29 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
* 06:28 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1210: Repooling after switchover
* 06:28 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
* 06:24 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2014.codfw.wmnet
* 06:22 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2014 from dbctl [[phab:T426595|T426595]]', diff saved to https://phabricator.wikimedia.org/P92578 and previous config saved to /var/cache/conftool/dbconfig/20260519-062227-marostegui.json
* 06:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 06:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depool db1210 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92577 and previous config saved to /var/cache/conftool/dbconfig/20260519-062056-fceratto.json
* 06:19 fceratto@dns1005: END - running authdns-update
* 06:18 fceratto@dns1005: START - running authdns-update
* 06:15 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db1230 to s5 primary and set section read-write [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92576 and previous config saved to /var/cache/conftool/dbconfig/20260519-061524-fceratto.json
* 06:14 fceratto@cumin1003: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92575 and previous config saved to /var/cache/conftool/dbconfig/20260519-061435-fceratto.json
* 06:14 federico3: Starting s5 eqiad failover from db1210 to db1230 - [[phab:T426087|T426087]]
* 06:09 fceratto@cumin1003: dbctl commit (dc=all): 'Set db1230 with weight 0 [[phab:T426087|T426087]]', diff saved to https://phabricator.wikimedia.org/P92574 and previous config saved to /var/cache/conftool/dbconfig/20260519-060929-fceratto.json
* 06:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426087|T426087]]
* 05:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2014.codfw.wmnet with reason: Maintenance on pc4
* 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.26 (duration: 02m 40s)
* 03:41 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]] (duration: 38m 23s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.3 refs [[phab:T423912|T423912]]
* 02:08 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 02:00 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:00 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:58 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] (duration: 06m 36s)
* 00:54 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:54 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289008{{!}}ThumbLimits: Harmonize svwiki large size with the rest of wikis (T376152)]]
* 00:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 00:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] (duration: 07m 08s)
* 00:26 ladsgroup@deploy1003: ladsgroup, jforrester: Continuing with deployment
* 00:25 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 00:25 ladsgroup@deploy1003: ladsgroup, jforrester: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 00:23 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289005{{!}}IS: Drop wgGraphDefaultVegaVer, never used any more]], [[gerrit:1289006{{!}}IS: Drop wgEnableSpecialMute, ignored since MW 1.46]], [[gerrit:1289007{{!}}IS: Drop wgDiscussionTools_visualenhancements_*, ignored since 2025]]
== 2026-05-18 ==
* 23:48 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 23:44 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on vrts1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 23:37 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lists2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 23:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
* 23:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
* 23:18 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] (duration: 06m 52s)
* 23:13 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 23:12 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:11 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1289000{{!}}Remove wgThumbnailStepsRatio]]
* 23:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 23:03 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 23:00 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 22:59 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
* 22:48 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
* 22:11 swfrench@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker2331.codfw.wmnet
* 22:06 swfrench@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker2331.codfw.wmnet
* 21:55 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:54 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2154].codfw.wmnet
* 21:45 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:44 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] (duration: 11m 29s)
* 21:38 krinkle@deploy1003: seddon, krinkle: Continuing with deployment
* 21:32 krinkle@deploy1003: seddon, krinkle: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:31 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1288925{{!}}Revert "Enable wgTrackMediaRequestProvenance on Commons" (T414338 T425580)]]
* 21:28 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2190.codfw.wmnet
* 21:28 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2190.codfw.wmnet
* 21:16 mutante: gerrit-replica.wikimedia.org back online
* 21:14 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:15:00 on gerrit-replica.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit2002.wikimedia.org with reason: [[phab:T426563|T426563]]
* 21:10 mutante: gerrit-replica.wikimedia.org, gerrit-spare.wikimedia.org - rebooting backends
* 21:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gerrit1003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 20:30 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:30 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:14 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2317-2330].codfw.wmnet
* 20:13 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:13 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2303-2316].codfw.wmnet
* 20:05 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2303-2316].codfw.wmnet
* 19:56 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:56 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:48 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:40 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2289-2302].codfw.wmnet
* 19:39 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:39 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:29 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:21 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2275-2288].codfw.wmnet
* 19:20 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:20 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:12 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:03 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2261-2274].codfw.wmnet
* 19:02 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 19:02 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:54 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:48 jhathaway@dns1004: END - running authdns-update
* 18:46 jhathaway@dns1004: START - running authdns-update
* 18:45 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2243,2248-2260].codfw.wmnet
* 18:45 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:44 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:38 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab2003.wikimedia.org with reason: [[phab:T426563|T426563]]
* 18:35 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2203-2215,2242].codfw.wmnet
* 18:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:32 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab-runner2004.codfw.wmnet
* 18:31 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
* 18:30 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog1002.eqiad.wmnet
* 18:30 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog1002.eqiad.wmnet
* 18:26 herron: rebooting alert1002
* 18:26 swfrench@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2203-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:22 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:22 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2191-2202].codfw.wmnet
* 18:20 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
* 18:20 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
* 18:19 swfrench@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2189.codfw.wmnet
* 18:19 swfrench@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2189.codfw.wmnet
* 18:16 mutante: releases.wikimedia.org - rebooting backends
* 18:16 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on releases2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 18:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
* 18:13 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
* 18:13 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
* 18:11 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 18:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 18:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 18:06 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
* 18:06 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
* 18:04 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 18:02 Reedy: Deployed patch for [[phab:T426631|T426631]]
* 17:59 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
* 17:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
* 17:56 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 17:50 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host centrallog2002.codfw.wmnet
* 17:50 dzahn@cumin2002: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
* 17:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
* 17:46 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab1005.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:46 herron: rebooting alert2002
* 17:45 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on phab2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:45 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:45 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana1002.eqiad.wmnet
* 17:44 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host alert2002.wikimedia.org
* 17:44 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
* 17:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
* 17:40 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana1002.eqiad.wmnet
* 17:38 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet
* 17:37 mutante: stewards* - rebooting
* 17:36 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host grafana2001.codfw.wmnet
* 17:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:32 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host grafana2001.codfw.wmnet
* 17:31 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet
* 17:30 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2004.codfw.wmnet
* 17:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1003.eqiad.wmnet
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 17:23 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 17:23 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host graphite2004.codfw.wmnet
* 17:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp1001.eqiad.wmnet
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 17:18 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog1003.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp1001.eqiad.wmnet
* 17:16 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 17:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2003.codfw.wmnet
* 17:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc2003.codfw.wmnet with reason: [[phab:T426563|T426563]]
* 17:14 mutante: doc.wikimedia.org - rebooting backends
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
* 17:13 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest2001.codfw.wmnet
* 17:13 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:13 topranks: restarted gnmic on netflow3004 as series missing for cr2-esams
* 17:12 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host o11ytest1001.eqiad.wmnet
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
* 17:11 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
* 17:11 mutante: etherpad - rebooting backends
* 17:10 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on etherpad1004.eqiad.wmnet with reason: [[phab:T426563|T426563]]
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host arclamp2001.codfw.wmnet
* 17:10 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host mwlog2003.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest2001.codfw.wmnet
* 17:07 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host o11ytest1001.eqiad.wmnet
* 17:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 17:04 mutante: contint2002, phab2002 - rebooting
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:49 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:49 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:42 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1373-1374].eqiad.wmnet
* 16:41 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:41 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:37 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 16:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:32 mutante: zuul[12]00[123] / zuul* - rebooting
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2189-2202].codfw.wmnet
* 16:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1370-1372].eqiad.wmnet
* 16:28 mutante: contint2003 - new jenkins - reboot for kernel upgrade
* 16:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:28 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:27 mutante: people.wikimedia.org backend - rebooting
* 16:22 mutante: contint1003 - rebooting
* 16:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 16:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:20 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:19 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1367-1369].eqiad.wmnet
* 16:18 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:14 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 16:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2170-2179,2184-2188].codfw.wmnet
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:12 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
* 16:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1364-1366].eqiad.wmnet
* 16:09 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:09 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:05 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 16:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 16:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2155-2169].codfw.wmnet
* 16:02 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:01 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1361-1363].eqiad.wmnet
* 16:00 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 16:00 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 15:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2155-2169].codfw.wmnet
* 15:54 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2155-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:53 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:53 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 15:52 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1358-1360].eqiad.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:51 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:50 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 15:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 15:49 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 15:48 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 15:48 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2152-2154].codfw.wmnet
* 15:48 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 15:45 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1355-1357].eqiad.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:43 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:41 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 15:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:40 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 Amir1: re-mapping thumbsize of 1 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2149-2151].codfw.wmnet
* 15:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 15:36 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1352-1354].eqiad.wmnet
* 15:34 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:34 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 15:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2146-2148].codfw.wmnet
* 15:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:28 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1349-1351].eqiad.wmnet
* 15:26 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:26 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 15:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2143-2145].codfw.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:19 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 15:18 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1346-1348].eqiad.wmnet
* 15:17 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:17 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 15:12 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 15:11 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1343-1345].eqiad.wmnet
* 15:11 herron@cumin1003: END (FAIL) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=99) rolling reboot on A:kafka-logging-codfw
* 15:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2142].codfw.wmnet
* 15:11 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 15:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 15:05 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1340-1342].eqiad.wmnet
* 15:04 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:04 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1337-1339].eqiad.wmnet
* 15:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2137-2139].codfw.wmnet
* 15:01 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2134-2136].codfw.wmnet
* 15:01 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:57 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1337-1339].eqiad.wmnet
* 14:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:54 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2134-2136].codfw.wmnet
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:48 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:48 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1334-1336].eqiad.wmnet
* 14:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:47 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:47 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 Daimona: Running queries to fix up data for [[phab:T426002|T426002]]
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2131-2133].codfw.wmnet
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:43 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 14:40 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1331-1333].eqiad.wmnet
* 14:38 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:38 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2128-2130].codfw.wmnet
* 14:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:34 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:33 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2192 weight', diff saved to https://phabricator.wikimedia.org/P92573 and previous config saved to /var/cache/conftool/dbconfig/20260518-143320-fceratto.json
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2149.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:32 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2143.codfw.wmnet with reason: Depooled host, will be decommissioned
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1328-1330].eqiad.wmnet
* 14:31 jiji@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1384].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:30 bking@deploy1003: Finished deploy [wdqs/wdqs@e8fb00c]: 0.3.163 (duration: 31m 47s)
* 14:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
* 14:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2125-2127].codfw.wmnet
* 14:25 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
* 14:24 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:22 mlitn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] (duration: 20m 05s)
* 14:22 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:18 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:18 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2192: Repooling after switchover
* 14:18 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2192: Repooling after switchover
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2114-2115,2124].codfw.wmnet
* 14:16 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe
* 14:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:16 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2179: Repooling after switchover
* 14:16 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2179: Repooling after switchover
* 14:13 mlitn@deploy1003: Rolling back deployment
* 14:13 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:10 Amir1: mapping thumbsize of 0 to 2 in all group1 wikis ([[phab:T376152|T376152]])
* 14:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:09 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:08 mlitn@deploy1003: mlitn: Continuing with deployment
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 14:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 14:05 mlitn@deploy1003: mlitn: Backport for [[gerrit:1288504{{!}}Squashed diff to master]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2111-2113].codfw.wmnet
* 14:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2108-2110].codfw.wmnet
* 14:02 mlitn@deploy1003: Started scap sync-world: Backport for [[gerrit:1288504{{!}}Squashed diff to master]]
* 13:58 bking@deploy1003: Started deploy [wdqs/wdqs@e8fb00c]: 0.3.163
* 13:56 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] (duration: 09m 57s)
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
* 13:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2108-2110].codfw.wmnet
* 13:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:51 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Continuing with deployment
* 13:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 13:48 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
* 13:47 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde, dancy: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:46 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:46 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1287895{{!}}Store uncomputed references delta as null, not 0 (T426002)]], [[gerrit:1287026{{!}}.gitignore: Add /static/hcaptcha/ (T403829)]]
* 13:44 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2105-2107].codfw.wmnet
* 13:44 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:44 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad] (duration: 01m 55s)
* 13:42 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (thin): Regular analytics weekly train THIN [analytics/refinery@ba10fcad]
* 13:42 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:41 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad] (duration: 05m 05s)
* 13:41 Lucas_WMDE: updateCollation arwikisource for [[phab:T426526|T426526]] finished
* 13:41 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:40 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:36 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca]: Regular analytics weekly train [analytics/refinery@ba10fcad]
* 13:36 tchin@deploy1003: Finished deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad] (duration: 01m 54s)
* 13:34 tchin@deploy1003: Started deploy [analytics/refinery@ba10fca] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ba10fcad]
* 13:33 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
* 13:33 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
* 13:33 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: updateCollation arwikisource --previous-collation=uppercase # [[phab:T426526|T426526]]
* 13:33 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:33 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] (duration: 11m 24s)
* 13:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2102-2104].codfw.wmnet
* 13:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:30 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:29 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:28 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Continuing with deployment
* 13:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 13:25 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5003.eqsin.wmnet
* 13:23 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:23 lucaswerkmeister-wmde@deploy1003: hubaishan, lucaswerkmeister-wmde: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2093-2095].codfw.wmnet
* 13:21 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288042{{!}}[config] Set Category Collation for arwikisource (T426526)]]
* 13:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5003.eqsin.wmnet
* 13:21 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:21 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:19 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] (duration: 09m 18s)
* 13:18 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:16 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
* 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Continuing with deployment
* 13:14 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:13 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
* 13:13 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2090-2092].codfw.wmnet
* 13:12 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:12 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:11 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, sgimeno: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1288487{{!}}fix(signup.js): Do not warn about a username being available (T419401)]]
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 13:07 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] (duration: 11m 27s)
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2087-2089].codfw.wmnet
* 13:03 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078].codfw.wmnet
* 13:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 13:00 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:59 kharlan@deploy1003: kharlan: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:56 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1288494{{!}}hCaptcha: Drop addurl trigger and 100% passive mode SiteKey (T426587)]]
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078].codfw.wmnet
* 12:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92570 and previous config saved to /var/cache/conftool/dbconfig/20260518-125038-fceratto.json
* 12:50 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2073-2075].codfw.wmnet
* 12:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:41 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast7002.wikimedia.org
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92569 and previous config saved to /var/cache/conftool/dbconfig/20260518-124030-fceratto.json
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast7002.wikimedia.org
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:32 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2070-2072].codfw.wmnet
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet
* 12:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P92568 and previous config saved to /var/cache/conftool/dbconfig/20260518-123022-fceratto.json
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:22 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
* 12:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 12:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2067-2069].codfw.wmnet
* 12:20 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92567 and previous config saved to /var/cache/conftool/dbconfig/20260518-122014-fceratto.json
* 12:18 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
* 12:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 12:13 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2062,2064-2065].codfw.wmnet
* 12:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:11 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 12:07 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 12:04 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2059-2061].codfw.wmnet
* 12:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 Dreamy_Jazz: Ran `scap remove-patch --message-body 'Dropping patch already made public' /srv/patches/next/extensions/ConfirmEdit/01-[[phab:T423840|T423840]].patch`
* 12:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2056-2058].codfw.wmnet
* 12:02 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 11:58 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2056-2058].codfw.wmnet
* 11:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:53 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2213 to s5 primary [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92566 and previous config saved to /var/cache/conftool/dbconfig/20260518-115304-fceratto.json
* 11:52 federico3: Starting s5 codfw failover from db2192 to db2213 - [[phab:T426600|T426600]]
* 11:51 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:50 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 11:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:46 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2213 with weight 0 [[phab:T426600|T426600]]', diff saved to https://phabricator.wikimedia.org/P92565 and previous config saved to /var/cache/conftool/dbconfig/20260518-114652-fceratto.json
* 11:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s5 [[phab:T426600|T426600]]
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2050-2051,2055].codfw.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 11:41 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
* 11:39 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
* 11:38 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
* 11:38 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2044,2046,2049].codfw.wmnet
* 11:36 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:36 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:32 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bullseye
* 11:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bullseye
* 11:29 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:27 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2039,2041-2042].codfw.wmnet
* 11:26 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:26 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bullseye
* 11:24 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
* 11:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 11:21 slyngshede@dns1004: END - running authdns-update
* 11:19 slyngshede@dns1004: START - running authdns-update
* 11:19 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast4006.wikimedia.org
* 11:17 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2036-2038].codfw.wmnet
* 11:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast4006.wikimedia.org
* 11:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm2001.wikimedia.org
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 11:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:09 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm2001.wikimedia.org
* 11:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm-test1001.wikimedia.org
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2033-2035].codfw.wmnet
* 11:09 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:09 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
* 11:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
* 11:05 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idm-test1001.wikimedia.org
* 11:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 11:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp1005.wikimedia.org
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2016-2018].codfw.wmnet
* 11:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2013-2015].codfw.wmnet
* 11:00 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp1005.wikimedia.org
* 11:00 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:56 slyngshede@dns1004: END - running authdns-update
* 10:55 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bullseye
* 10:54 slyngshede@dns1004: START - running authdns-update
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bullseye
* 10:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bullseye
* 10:53 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2013-2015].codfw.wmnet
* 10:51 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:50 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp2005.wikimedia.org
* 10:46 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp2005.wikimedia.org
* 10:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 10:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2006,2011-2012].codfw.wmnet
* 10:42 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:42 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:41 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 10:41 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2005.wikimedia.org
* 10:37 slyngshede@cumin1003: START - Cookbook sre.hosts.reboot-single for host idp-test2005.wikimedia.org
* 10:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2001-2002,2005].codfw.wmnet
* 10:33 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:27 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:21 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc4', diff saved to https://phabricator.wikimedia.org/P92564 and previous config saved to /var/cache/conftool/dbconfig/20260518-101917-marostegui.json
* 10:18 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2024: replacing hw
* 10:18 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 10:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2024: replacing hw
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to pc4 master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92563 and previous config saved to /var/cache/conftool/dbconfig/20260518-101749-marostegui.json
* 10:17 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2024 to dbctl [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92562 and previous config saved to /var/cache/conftool/dbconfig/20260518-101714-marostegui.json
* 10:11 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92561 and previous config saved to /var/cache/conftool/dbconfig/20260518-100831-fceratto.json
* 10:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 10:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92560 and previous config saved to /var/cache/conftool/dbconfig/20260518-100710-fceratto.json
* 10:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2240 to s4 primary [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92559 and previous config saved to /var/cache/conftool/dbconfig/20260518-100203-fceratto.json
* 10:00 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 10:00 federico3: Starting s4 codfw failover from db2179 to db2240 - [[phab:T426590|T426590]]
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 09:57 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:57 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2240 with weight 0 [[phab:T426590|T426590]]', diff saved to https://phabricator.wikimedia.org/P92558 and previous config saved to /var/cache/conftool/dbconfig/20260518-095218-fceratto.json
* 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 40 hosts with reason: Primary switchover s4 [[phab:T426590|T426590]]
* 09:50 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 09:47 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:47 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1082.eqiad.wmnet with OS bullseye
* 09:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:41 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:38 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 09:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2374].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 09:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1086.eqiad.wmnet with reason: host reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:29 cezmunsta: Removing db2152.codfw.wmnet from orchestrator [[phab:T424344|T424344]]
* 09:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1086
* 09:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1086
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1086.eqiad.wmnet 18.32.64.10.in-addr.arpa 8.1.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1082.eqiad.wmnet with reason: host reimage
* 09:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1086 - mvernon@cumin2002"
* 09:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1086
* 09:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1086.eqiad.wmnet with OS bullseye
* 09:18 moritzm: installing Java 21 security updates
* 09:13 cezmunsta: Removing db2152.codfw.wmnet from zarcillo [[phab:T424344|T424344]]
* 09:11 jnuche@deploy1003: Installation of scap version "4.265.2" completed for 1 hosts
* 09:10 jnuche@deploy1003: Installing scap version "4.265.2" for 1 host(s)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2152.codfw.wmnet
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:08 cwilliams@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 cwilliams@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2152.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cwilliams@cumin1003"
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be1082
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1082
* 09:07 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1082
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be1082.eqiad.wmnet 52.32.64.10.in-addr.arpa 2.5.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:07 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be1082 - mvernon@cumin2002"
* 09:05 jnuche@deploy1003: Installing scap version "4.265.2" for 163 host(s)
* 09:04 cwilliams@cumin1003: START - Cookbook sre.dns.netbox
* 09:03 ayounsi@dns1004: END - running authdns-update
* 09:02 javiermonton@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] (duration: 31m 35s)
* 09:01 ayounsi@dns1004: START - running authdns-update
* 08:59 cwilliams@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2152.codfw.wmnet
* 08:50 javiermonton@deploy1003: javiermonton: Continuing with deployment
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 javiermonton@deploy1003: javiermonton: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:50 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 08:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 08:41 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be1082
* 08:40 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1082.eqiad.wmnet with OS bullseye
* 08:31 javiermonton@deploy1003: Started scap sync-world: Backport for [[gerrit:1287366{{!}}stream: mediawiki.page_html_content_change (T423920)]]
* 08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 08:15 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:14 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 08:12 moritzm: installing glibc bugfix updates from bookworm point release
* 07:46 moritzm: installing systemd bugfix updates from bookworm point release
* 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 07:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 07:35 moritzm: installing openssl bugfix updates from bookworm point release
* 07:25 godog: clean up space on cloudcumin1001: apt archives and older kernels
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2013.codfw.wmnet
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:14 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pc2013.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:09 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:04 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts pc2013.codfw.wmnet
* 07:03 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2013 from dbctl [[phab:T426555|T426555]]', diff saved to https://phabricator.wikimedia.org/P92557 and previous config saved to /var/cache/conftool/dbconfig/20260518-070322-marostegui.json
* 06:59 moritzm: installing systemd bugfix updates from trixie point release
* 06:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Kgraessle out of all services on: 2468 hosts
* 06:54 moritzm: installing Linux 6.1.172 on bookworm hosts
* 06:49 moritzm: installing glibc bugfix updates from trixie point release
* 06:44 moritzm: installing openssl bugfix updates from trixie point release
* 06:33 moritzm: installing Linux 6.12.88 on trixie hosts
* 05:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 05:11 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 05:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2014,2024].codfw.wmnet,pc1014.eqiad.wmnet with reason: Maintenance on pc4
== 2026-05-15 ==
* 21:03 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] (duration: 07m 43s)
* 20:59 jforrester@deploy1003: jforrester, seddon: Continuing with deployment
* 20:57 jforrester@deploy1003: jforrester, seddon: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:55 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1287940{{!}}Revert "Enable wgTrackMediaRequestProvenance on remaining Wikipedias" (T425580)]]
* 20:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1290.eqiad.wmnet with OS bookworm
* 20:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:09 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:47 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1290.eqiad.wmnet with reason: host reimage
* 19:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 19:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 19:21 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 19:21 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:53 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 16:02 dancy@deploy1003: Installation of scap version "4.265.1" completed for 2 hosts
* 16:00 dancy@deploy1003: Installing scap version "4.265.1" for 2 host(s)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove IPs that had been used for ulsfo cr links from dns - cmooney@cumin1003"
* 12:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2009.codfw.wmnet
* 11:59 Emperor: depool / restart swift / repool on ms-fe2010 ms-fe2012
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe2009.codfw.wmnet
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:34 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2065.codfw.wmnet with OS bullseye
* 11:14 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2065.codfw.wmnet with reason: host reimage
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2064.codfw.wmnet with OS bullseye
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:46 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2010.codfw.wmnet with OS trixie
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2065
* 10:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2065
* 10:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2065
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2065.codfw.wmnet 167.48.192.10.in-addr.arpa 7.6.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2065 - mvernon@cumin2002"
* 10:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2065
* 10:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2065.codfw.wmnet with OS bullseye
* 10:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:31 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:28 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 10:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:23 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:22 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 10:20 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2064.codfw.wmnet with reason: host reimage
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:12 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:12 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify entries for ulsfo router interfaces - cmooney@cumin1003"
* 10:10 topranks: Migrate ulsfo cr<->cr traffic to use path via switches not direct link [[phab:T424611|T424611]]
* 10:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 10:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 10:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 10:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 10:00 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:56 topranks: Migrate cr3-ulsfo link to asw1-22-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 09:48 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2064.codfw.wmnet with OS bullseye
* 09:32 topranks: Migrate cr4-ulsfo link to asw1-23-ulsfo to tagged interface [[phab:T424611|T424611]]
* 09:30 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 09:30 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 09:30 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2065
* 09:30 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 09:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on db2218.codfw.wmnet with reason: Host crashed [[phab:T426383|T426383]]
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2064
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2064
* 09:06 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2064
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2064.codfw.wmnet 56.32.192.10.in-addr.arpa 6.5.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:06 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2064 - mvernon@cumin2002"
* 09:03 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 09:02 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2064
* 09:01 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2064.codfw.wmnet with OS bullseye
* 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2218 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92553 and previous config saved to /var/cache/conftool/dbconfig/20260515-090000-marostegui.json
* 08:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2220 to s7 primary [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92552 and previous config saved to /var/cache/conftool/dbconfig/20260515-085836-marostegui.json
* 08:56 marostegui: Starting s7 codfw failover from db2218 to db2220 - [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426380|T426380]]
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2220 with weight 0 [[phab:T426380|T426380]]', diff saved to https://phabricator.wikimedia.org/P92551 and previous config saved to /var/cache/conftool/dbconfig/20260515-085420-marostegui.json
* 08:41 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2065
* 08:41 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2064
* 08:28 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 08:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:55 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2064
* 07:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 07:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 07:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
* 07:39 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
* 07:10 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:31 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 50s)
* 02:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1290.eqiad.wmnet with OS bookworm
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1289.eqiad.wmnet with OS bookworm
* 01:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1289.eqiad.wmnet with reason: host reimage
* 00:43 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:14 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1290.eqiad.wmnet with OS bookworm
* 00:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:01 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-14 ==
* 23:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:57 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 23:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 23:54 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:49 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:39 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:34 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:26 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:13 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 23:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 23:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:47 egardner@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] (duration: 07m 14s)
* 21:43 egardner@deploy1003: egardner: Continuing with deployment
* 21:41 egardner@deploy1003: egardner: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 egardner@deploy1003: Started scap sync-world: Backport for [[gerrit:1287488{{!}}Share Highlight: overdraw photo on share card canvas (T426344)]]
* 21:33 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] (duration: 09m 15s)
* 21:29 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 21:26 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:24 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1287485{{!}}Disable Reading Lists survey for Wikipedias (T421776)]]
* 21:16 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] (duration: 06m 33s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1286.eqiad.wmnet with OS bookworm
* 21:15 vriley@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:12 dreamyjazz@deploy1003: dreamyjazz, seddon: Continuing with deployment
* 21:11 dreamyjazz@deploy1003: dreamyjazz, seddon: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1287479{{!}}Enable hCaptcha for account creation API on group 0 wiki's]], [[gerrit:1287484{{!}}Remove DynamicPageList from legalteamwiki as unused]]
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1287.eqiad.wmnet with OS bookworm
* 20:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:55 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] (duration: 07m 03s)
* 20:46 sbisson@deploy1003: sbisson: Continuing with deployment
* 20:45 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287427{{!}}Simplewiki: include article wizard in AG experiment (T426278)]]
* 20:43 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1287.eqiad.wmnet with reason: host reimage
* 20:35 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] (duration: 10m 18s)
* 20:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1289.eqiad.wmnet with OS bookworm
* 20:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1289.eqiad.wmnet with OS bookworm
* 20:27 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1287002{{!}}Disable wgNewUserMessageOnAutoCreate on all WMF wikis (T426206)]]
* 20:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:19 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1287.eqiad.wmnet with OS bookworm
* 20:19 jsn@deploy1003: Finished scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] (duration: 07m 48s)
* 20:18 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1286.eqiad.wmnet with reason: host reimage
* 20:14 jsn@deploy1003: kgraessle, jsn: Continuing with deployment
* 20:13 jsn@deploy1003: kgraessle, jsn: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jsn@deploy1003: Started scap sync-world: Backport for [[gerrit:1192921{{!}}Enable AutoModerator on Italian Wikipedia (T405152)]], [[gerrit:1286974{{!}}Enable AutoModerator on Albanian Wikipedia (T420450)]], [[gerrit:1286975{{!}}Enable AutoModerator on Dutch Wikipedia (T425509)]]
* 20:03 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 20:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1286.eqiad.wmnet with OS bookworm
* 19:56 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1281.eqiad.wmnet with OS bookworm
* 19:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:46 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1286.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1286
* 19:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1286
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:26 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:26 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1286] - vriley@cumin1003"
* 19:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1281.eqiad.wmnet with reason: host reimage
* 19:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1274.eqiad.wmnet with OS bookworm
* 19:14 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1281.eqiad.wmnet with OS bookworm
* 18:58 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:57 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 18:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:25 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1274.eqiad.wmnet with reason: host reimage
* 18:17 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:16 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:14 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 17:32 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 17:31 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 17:23 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:17 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:17 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:16 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:15 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 17:10 cmooney@dns2005: END - running authdns-update
* 17:09 cmooney@dns2005: START - running authdns-update
* 17:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:58 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T426298|T426298]]
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:49 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:36 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 16:35 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 16:31 topranks: disable core router direct link at esams now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:25 topranks: disable core router direct link at drmrs now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:21 topranks: disable core router direct link at magru now that traffic is flowing via switches [[phab:T424611|T424611]]
* 16:20 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:20 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:16 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:14 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1288.eqiad.wmnet with OS bookworm
* 16:13 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 16:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:07 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:07 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove records for deleted IPs esams,drmrs and magru - cmooney@cumin1003"
* 16:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:04 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:59 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
* 15:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1290.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:56 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1290
* 15:55 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1290
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:55 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1290] - vriley@cumin1003"
* 15:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:51 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:50 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1288.eqiad.wmnet with reason: host reimage
* 15:49 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Release v0.11.2 - cmooney@cumin1003
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1285.eqiad.wmnet with OS bookworm
* 15:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:46 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 15:45 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1289.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1289
* 15:41 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1289] - vriley@cumin1003"
* 15:35 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 15:33 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1288.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1284.eqiad.wmnet with OS bookworm
* 15:32 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 15:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 15:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1285.eqiad.wmnet with reason: host reimage
* 15:16 bearloga@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] (duration: 06m 20s)
* 15:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:12 bearloga@deploy1003: bearloga: Continuing with deployment
* 15:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:12 bearloga@deploy1003: bearloga: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:10 bearloga@deploy1003: Started scap sync-world: Backport for [[gerrit:1287422{{!}}EventStreamConfig: fix product_metrics.web_base (T426209)]]
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1284.eqiad.wmnet with reason: host reimage
* 15:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1285.eqiad.wmnet with OS bookworm
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1288.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92544 and previous config saved to /var/cache/conftool/dbconfig/20260514-145715-fceratto.json
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1288
* 14:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1283.eqiad.wmnet with OS bookworm
* 14:54 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:54 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2010.codfw.wmnet with reason: host reimage
* 14:54 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1288
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1288] - vriley@cumin1003"
* 14:52 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1284.eqiad.wmnet with OS bookworm
* 14:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92542 and previous config saved to /var/cache/conftool/dbconfig/20260514-144707-fceratto.json
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1287.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1285.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1287] - vriley@cumin1003"
* 14:37 vriley@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host db1289
* 14:37 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1289
* 14:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92541 and previous config saved to /var/cache/conftool/dbconfig/20260514-143659-fceratto.json
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1282.eqiad.wmnet with OS bookworm
* 14:35 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:34 phuedx@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] (duration: 11m 14s)
* 14:33 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1283.eqiad.wmnet with reason: host reimage
* 14:33 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1285
* 14:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1285
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:31 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:31 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1285] - vriley@cumin1003"
* 14:29 phuedx@deploy1003: phuedx: Continuing with deployment
* 14:27 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92540 and previous config saved to /var/cache/conftool/dbconfig/20260514-142650-fceratto.json
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 14:24 phuedx@deploy1003: phuedx: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1280.eqiad.wmnet with OS bookworm
* 14:23 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:22 phuedx@deploy1003: Started scap sync-world: Backport for [[gerrit:1287368{{!}}ext.wikimediaEvents: Add synth-aa-ncs-1 experiment (T419514)]]
* 14:21 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1284.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1284
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92539 and previous config saved to /var/cache/conftool/dbconfig/20260514-141922-fceratto.json
* 14:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 14:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1283.eqiad.wmnet with OS bookworm
* 14:18 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2150 from dbctl [[phab:T424342|T424342]]', diff saved to https://phabricator.wikimedia.org/P92538 and previous config saved to /var/cache/conftool/dbconfig/20260514-141812-cwilliams.json
* 14:17 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1284
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:17 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1284] - vriley@cumin1003"
* 14:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92537 and previous config saved to /var/cache/conftool/dbconfig/20260514-141644-fceratto.json
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1282.eqiad.wmnet with reason: host reimage
* 14:14 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] (duration: 08m 00s)
* 14:13 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:09 krinkle@deploy1003: krinkle, robertsky: Continuing with deployment
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:08 krinkle@deploy1003: krinkle, robertsky: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1279.eqiad.wmnet with OS bookworm
* 14:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P92536 and previous config saved to /var/cache/conftool/dbconfig/20260514-140635-fceratto.json
* 14:06 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1287367{{!}}throttle rule for ESEAP Conference 2026 15-18 May 2026 (T426295)]]
* 14:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2010.codfw.wmnet with OS trixie
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 14:01 cwilliams@cumin1003: dbctl commit (dc=all): 'Remove db2151 from dbctl [[phab:T424343|T424343]]', diff saved to https://phabricator.wikimedia.org/P92535 and previous config saved to /var/cache/conftool/dbconfig/20260514-140110-cwilliams.json
* 14:00 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] (duration: 07m 09s)
* 13:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1282.eqiad.wmnet with OS bookworm
* 13:58 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1280.eqiad.wmnet with reason: host reimage
* 13:57 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:56 mfossati@deploy1003: mfossati: Continuing with deployment
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92534 and previous config saved to /var/cache/conftool/dbconfig/20260514-135626-fceratto.json
* 13:56 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:56 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host sretest2010.codfw.wmnet with OS trixie
* 13:56 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:55 mfossati@deploy1003: mfossati: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:54 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:53 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1287363{{!}}Scale share-highlight card to fit small viewports (T426247)]]
* 13:53 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2152.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2165 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92533 and previous config saved to /var/cache/conftool/dbconfig/20260514-135315-fceratto.json
* 13:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 13:53 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2151.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:52 cwilliams@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2150.codfw.wmnet with reason: Depooled host, will be decommissioned
* 13:49 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] (duration: 07m 03s)
* 13:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:48 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:45 krinkle@deploy1003: krinkle: Continuing with deployment
* 13:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1279.eqiad.wmnet with reason: host reimage
* 13:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:44 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1280.eqiad.wmnet with OS bookworm
* 13:42 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269442{{!}}Enable wgTrackMediaRequestProvenance on remaining Wikipedias (T414338)]]
* 13:42 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] (duration: 12m 33s)
* 13:40 elukey@cumin1003: START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:38 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:37 krinkle@deploy1003: krinkle, annet: Continuing with deployment
* 13:33 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Host will be decommissioned
* 13:33 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Host will be decommissioned
* 13:32 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Host will be decommissioned
* 13:31 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Host will be decommissioned
* 13:31 krinkle@deploy1003: krinkle, annet: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1279.eqiad.wmnet with OS bookworm
* 13:29 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1285913{{!}}Add ReadingLists Account Creation CTA campaign (T422169)]], [[gerrit:1286327{{!}}WelcomeSurvey: Respect returnTo for campaigns skipping the survey (T422169)]]
* 13:22 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1283.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1283
* 13:19 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:18 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1283
* 13:16 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] (duration: 08m 10s)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:15 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1283] - vriley@cumin1003"
* 13:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:12 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1282.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:10 sbisson@deploy1003: sbisson: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:10 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2165: Repooling after switchover
* 13:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1282
* 13:08 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287043{{!}}Enable the Article Guidance experiment on simplewiki (T426278)]]
* 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2165: Repooling after switchover
* 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Set correct weight [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92529 and previous config saved to /var/cache/conftool/dbconfig/20260514-130743-fceratto.json
* 13:07 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1282
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1282] - vriley@cumin1003"
* 13:05 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1281.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Promote db2161 to s8 primary [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92528 and previous config saved to /var/cache/conftool/dbconfig/20260514-130213-fceratto.json
* 13:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:01 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1281
* 13:00 federico3: Starting s8 codfw failover from db2165 to db2161 - [[phab:T426291|T426291]]
* 13:00 kart_: Updated cxserver to 2026-05-14-123010-production ([[phab:T426174|T426174]], [[phab:T404298|T404298]])
* 12:59 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1281
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:59 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1281] - vriley@cumin1003"
* 12:58 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 12:57 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 12:56 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 12:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1280.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:55 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 12:55 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:54 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1280
* 12:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1280
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1280] - vriley@cumin1003"
* 12:50 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1279.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Set db2161 with weight 0 [[phab:T426291|T426291]]', diff saved to https://phabricator.wikimedia.org/P92527 and previous config saved to /var/cache/conftool/dbconfig/20260514-125014-fceratto.json
* 12:49 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1279
* 12:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s8 [[phab:T426291|T426291]]
* 12:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:47 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1279
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:47 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1279] - vriley@cumin1003"
* 12:47 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 12:46 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
* 12:42 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:40 cmooney@cumin1003: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: update bgp groups for dse-k8s-wdqs - cmooney@cumin1003
* 12:31 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 28458
* 12:27 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 28458
* 12:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc3 with pc2023 as codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92526 and previous config saved to /var/cache/conftool/dbconfig/20260514-122707-marostegui.json
* 12:21 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:21 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:20 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 codfw master [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92525 and previous config saved to /var/cache/conftool/dbconfig/20260514-121958-marostegui.json
* 12:18 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2023 to pc3 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92524 and previous config saved to /var/cache/conftool/dbconfig/20260514-121839-marostegui.json
* 11:31 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 11:31 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 11:08 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 11:02 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 11:01 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
* 11:00 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:00 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bullseye
* 10:49 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bullseye
* 10:45 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2152 from dbctl [[phab:T424344|T424344]]', diff saved to https://phabricator.wikimedia.org/P92523 and previous config saved to /var/cache/conftool/dbconfig/20260514-104521-marostegui.json
* 10:41 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
* 10:40 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
* 10:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
* 10:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
* 10:27 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:25 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:19 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bullseye
* 10:14 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bullseye
* 10:14 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 10:02 cwilliams@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2152: Host will be decommissioned
* 10:02 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152: Host will be decommissioned
* 09:54 cwilliams@cumin1003: END (ERROR) - Cookbook sre.mysql.depool (exit_code=97) depool db2152.codfw.wmnet: Host will be decommissioned
* 09:51 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 09:51 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 09:49 cwilliams@cumin1003: START - Cookbook sre.mysql.depool depool db2152.codfw.wmnet: Host will be decommissioned
* 09:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bullseye
* 09:33 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bullseye
* 09:30 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bullseye
* 09:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bullseye
* 09:23 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:20 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 09:18 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:10 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 09:06 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
* 09:06 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
* 08:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 08:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bullseye
* 08:54 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bullseye
* 08:54 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bullseye
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2149 [[phab:T424341|T424341]]', diff saved to https://phabricator.wikimedia.org/P92520 and previous config saved to /var/cache/conftool/dbconfig/20260514-083916-marostegui.json
* 08:08 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 07:01 kart_: Update cxserver to 2026-04-23-114216-production ([[phab:T423002|T423002]])
* 07:00 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:00 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2013,2023].codfw.wmnet,pc1013.eqiad.wmnet with reason: Maintenance on pc3
* 06:40 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:40 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 06:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2013: Replacing HW [[phab:T418973|T418973]]
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1158: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1158: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS trixie
* 05:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
* 05:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS trixie
* 05:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1158: Reimage to Trixie
* 05:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Reimage to Trixie
* 05:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s7 master: reimage to Debian Trixie
* 05:04 marostegui@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 49s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:07 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
== 2026-05-13 ==
* 21:12 Amir1: remapping thumbsize of 0 to 2 in all group0 wikis ([[phab:T376152|T376152]])
* 21:06 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:55 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] (duration: 07m 48s)
* 20:51 jdlrobson@deploy1003: ladsgroup, jdlrobson: Continuing with deployment
* 20:49 jdlrobson@deploy1003: ladsgroup, jdlrobson: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:47 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287022{{!}}wgThumbLimits: Remove the exception for itwikiquote (T376152)]]
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:43 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] (duration: 07m 32s)
* 20:42 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 20:41 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 20:38 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:37 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:35 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287000{{!}}Handle share-highlight images w/o resizeUrl (T426215)]]
* 20:33 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] (duration: 07m 26s)
* 20:28 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 20:27 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1287006{{!}}Update small size for Swedish Wikipedia (T424910)]]
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:25 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:23 ebernhardson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] (duration: 07m 06s)
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:21 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:19 ebernhardson@deploy1003: ebernhardson: Continuing with deployment
* 20:18 ebernhardson@deploy1003: ebernhardson: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 20:17 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 20:16 ebernhardson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286997{{!}}Revert "cirrus: AB test query suggester variants" (T407432)]]
* 20:13 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] (duration: 06m 47s)
* 20:13 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 20:09 cjming@deploy1003: bpirkle, cjming: Continuing with deployment
* 20:09 cjming@deploy1003: bpirkle, cjming: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1286981{{!}}Revert "Add wikibase.v1 module to the sandbox were it is present" (T422403)]]
* 19:23 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 19:23 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 19:09 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:27 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:26 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:25 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:20 cmooney@dns2005: END - running authdns-update
* 18:19 cmooney@dns2005: START - running authdns-update
* 18:14 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:13 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:13 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:13 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new entries for ulsfo and eqsin IPs - cmooney@cumin1003"
* 18:09 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 18:01 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:00 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 17:50 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 17:50 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 17:47 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 17:47 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 17:47 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 17:43 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:42 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:36 topranks: update OSPF config on magru core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:34 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:33 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:28 mutante: zuul1001 systemctl start zuul-scheduler ; /usr/bin/docker exec zuul-scheduler zuul-scheduler smart-reconfigure
* 17:26 mutante: zuul1001 - stopping zuul-web; then manually running: /usr/sbin/usermod -u 923 zuul
* 17:26 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:26 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:24 topranks: update OSPF config on esams core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 17:20 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 17:19 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 17:05 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
* 17:03 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
* 16:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 16:43 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 16:29 topranks: update OSPF config on drmrs core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:20 topranks: update OSPF config on eqsin core routers to shift traffic to switch links [[phab:T424611|T424611]]
* 16:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 16:10 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:53 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:53 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:45 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:42 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:37 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7009.*
* 15:36 fabfur: repooling cp7009 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:32 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 15:27 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7009.*
* 15:27 fabfur: depooling cp7009 to install haproxy-awslc ([[phab:T419825|T419825]])
* 15:18 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:16 cmooney@dns2005: END - running authdns-update
* 15:15 cmooney@dns2005: START - running authdns-update
* 15:11 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:04 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:04 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:04 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 15:01 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:00 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:54 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:53 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:53 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns records for missing ulsfo subnets - cmooney@cumin1003"
* 14:51 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:50 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 14:49 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:43 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki-root1002.eqiad.wmnet with OS trixie
* 14:42 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] (duration: 07m 17s)
* 14:37 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:36 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:34 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286917{{!}}WikiEditor: Populate user_groups in EditAttemptStep events (T424010)]]
* 14:33 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-eqiad: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:33 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add missing DNS name for ulsfo network new switches - pt1979@cumin2002"
* 14:28 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:28 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:25 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:19 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] (duration: 06m 35s)
* 14:17 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1002.eqiad.wmnet with reason: host reimage
* 14:16 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 klausman@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache-codfw: Enable Java security updates - klausman@cumin1003
* 14:15 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:15 jforrester@deploy1003: jforrester: Continuing with deployment
* 14:15 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: jforrester: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:14 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:12 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1286924{{!}}Disable wgWikiLambdaEnableAbstractClientMode everywhere (T422647)]]
* 14:11 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:09 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* 14:08 Lucas_WMDE: UTC afternoon backport+config window done
* 14:08 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* {{safesubst:SAL entry|1=14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAl}}
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:03 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Continuing with deployment
* 14:03 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp7001.*
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 14:02 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 14:01 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* {{safesubst:SAL entry|1=14:01 lucaswerkmeister-wmde@deploy1003: dragoniez, matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-AP}}
* 14:01 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 14:00 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 14:00 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki-root1002.eqiad.wmnet with OS trixie
* 13:59 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mathoid: apply
* 13:59 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-canary: Restart for upgrade to JVM 11.0.31 - eevans@cumin1003
* {{safesubst:SAL entry|1=13:59 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286890{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286897{{!}}ApiQueryGlobalUsers: Fix parsing logic for legacy log_params entries (T426033)]], [[gerrit:1286891{{!}}Add 'Promise-Non-Write-API-Action' to $wgAllowedCorsHeaders (T425972)]], [[gerrit:1286892{{!}}Add 'Promise-Non-Write-API-Action' to $wgAll}}
* 13:58 fabfur: repooling cp7001 to test haproxy-awslc behavior ([[phab:T419825|T419825]])
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 13:50 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] (duration: 07m 36s)
* 13:49 jmm@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:45 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Continuing with deployment
* 13:44 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, codenamenoreste: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1284900{{!}}Completely disable MediaWiki page patrolling functions on German Wikipedia (T316393)]]
* {{safesubst:SAL entry|1=13:40 mfossati@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers t}}
* 13:36 mfossati@deploy1003: jdlrobson, mfossati: Continuing with deployment
* {{safesubst:SAL entry|1=13:29 mfossati@deploy1003: jdlrobson, mfossati: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers that d}}
* 13:28 jmm@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Java security update - jmm@cumin2002
* 13:27 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* {{safesubst:SAL entry|1=13:27 mfossati@deploy1003: Started scap sync-world: Backport for [[gerrit:1286518{{!}}[Share Highlight] Exclude section edit links, footnotes from selection (T423658)]], [[gerrit:1286838{{!}}Add robust color fallbacks for QuoteCard average-color styling (T425358)]], [[gerrit:1286839{{!}}Fixed card width (T425710)]], [[gerrit:1286844{{!}}Adjust image size to match fixed width (T425710)]], [[gerrit:1286846{{!}}ShareHighlight: exclude browsers th}}
* 13:25 moritzm: installing openjdk-11 security updates
* 13:18 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki-root1002.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 13:12 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] (duration: 08m 18s)
* 13:07 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:05 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:04 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=pki,name=codfw
* 13:03 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286359{{!}}Add configurable user-agent and sparql endpoint url (T425389)]]
* 12:50 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] (duration: 06m 42s)
* 12:46 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 12:45 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:43 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1286884{{!}}Fix TypeError on saving userrights interwiki (T426185)]]
* 12:41 fabfur@cumin1003: conftool action : set/pooled=no; selector: name=cp7001.*
* 12:40 fabfur: depool cp7001 to test haproxy-awslc (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1286526) ([[phab:T419825|T419825]])
* 12:38 topranks: add ibgp peering between cr1-magru and cr2-magru over loopback IPs [[phab:T424611|T424611]]
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0)
* 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1236: Migration of db1236.eqiad.wmnet completed
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 12:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 12:02 topranks: add ibgp peering between cr1-esams and cr2-esams over loopback IPs [[phab:T424611|T424611]]
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:57 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:57 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update records for drmrs ibgp link - cmooney@cumin1003"
* 11:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2220: after reimage to trixie
* 11:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 11:51 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1236: Migration of db1236.eqiad.wmnet completed
* 11:44 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 11:43 jiji@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 11:43 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1236.eqiad.wmnet with OS trixie
* 11:40 topranks: delete old direct ibgp peering between cr1-drmrs and cr2-drmrs [[phab:T424611|T424611]]
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 11:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 11:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 11:27 topranks: add ibgp peering between cr1-drmrs and cr2-drmrs over loopback IPs [[phab:T424611|T424611]]
* 11:25 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:24 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 11:24 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 11:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1236.eqiad.wmnet with reason: host reimage
* 11:19 moritzm: installing Linux 6.1.170-3 on all Bookworm hosts
* 11:12 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki2002.codfw.wmnet with OS trixie
* 11:10 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2220: after reimage to trixie
* 11:06 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1236.eqiad.wmnet with OS trixie
* 11:04 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1236: Upgrading db1236.eqiad.wmnet
* 11:03 fceratto@cumin1003: START - Cookbook sre.mysql.major-upgrade
* 10:58 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS trixie
* 10:55 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:55 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 10:52 moritzm: installing Linux 5.10.251-4 on all Bullseye hosts
* 10:49 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 10:42 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pki2002.codfw.wmnet with reason: host reimage
* 10:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 10:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 10:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:33 topranks: switch eqsin core router ibgp path to route via switches [[phab:T424611|T424611]]
* 10:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
* 10:25 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host pki2002.codfw.wmnet with OS trixie
* 10:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:22 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 10:21 elukey@cumin1003: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki2002.codfw.wmnet
* 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:16 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:15 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 10:15 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:14 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:10 moritzm: installing Apache security updates on Bullseye
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:09 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 10:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS trixie
* 10:05 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 10:05 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bullseye
* 10:04 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
* 10:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2220: Reimage to Trixie
* 10:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Reimage to Trixie
* 10:02 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
* 10:01 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
* 09:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2220 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92500 and previous config saved to /var/cache/conftool/dbconfig/20260513-095934-marostegui.json
* 09:58 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2218 to s7 primary [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92499 and previous config saved to /var/cache/conftool/dbconfig/20260513-095814-marostegui.json
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:58 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 09:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bullseye
* 09:56 moritzm: installing distro-info-data updates from Bookworm point release
* 09:54 marostegui: Starting s7 codfw failover from db2220 to db2218 - [[phab:T426142|T426142]]
* 09:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 [[phab:T426142|T426142]]
* 09:53 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bullseye
* 09:53 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2218 with weight 0 [[phab:T426142|T426142]]', diff saved to https://phabricator.wikimedia.org/P92498 and previous config saved to /var/cache/conftool/dbconfig/20260513-095337-marostegui.json
* 09:51 moritzm: installing ca-certificates update from Bookworm point release
* 09:50 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bullseye
* 09:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:45 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] (duration: 09m 01s)
* 09:42 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:41 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:38 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:38 kharlan@deploy1003: kharlan: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:36 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1284633{{!}}EventStreamConfig: Register special_user_login event stream (T425631)]]
* 09:34 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
* 09:30 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
* 09:29 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
* 09:29 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
* 09:28 cmooney@dns2005: END - running authdns-update
* 09:27 cmooney@dns2005: START - running authdns-update
* 09:27 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:25 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki2002.codfw.wmnet
* 09:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:22 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki2002.codfw.wmnet with reason: reimage
* 09:21 logmsgbot: dreamyjazz Deployed security patch for [[phab:T423840|T423840]]
* 09:17 elukey@cumin1003: START - Cookbook sre.hosts.provision for host pki2002.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bullseye
* 09:17 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bullseye
* 09:17 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bullseye
* 09:14 elukey@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=pki,name=codfw
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:14 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:10 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for 2620:0:863:fe09::/64 - cmooney@cumin1003"
* 09:07 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 moritzm: installing dnsmasq security updates
* 08:40 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:38 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 08:38 cmooney@dns2005: END - running authdns-update
* 08:37 cmooney@dns2005: START - running authdns-update
* 08:36 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 08:35 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:32 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add include for 2620:0:863:fe0a::/64 - cmooney@cumin1003"
* 08:32 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 08:28 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 08:25 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 08:25 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 08:24 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] (duration: 09m 18s)
* 08:20 kharlan@deploy1003: kharlan: Continuing with deployment
* 08:16 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:14 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286805{{!}}WikimediaEvents: Enable Special:UserLogin instrumentation (T425631)]]
* 08:11 moritzm: imported dnsmasq 2.92-1~wmf13u2 to trixie-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 08:08 topranks: reconfigure link from cr4-ulsfo to asw1-22-ulsfo as 802.1q tagged [[phab:T424611|T424611]]
* 07:56 moritzm: imported dnsmasq 2.92-1~wmf12u2 to bookworm-wikimedia/main (backport of latest dnsmasq security fixes to our internal build)
* 07:47 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] (duration: 09m 09s)
* 07:43 dcausse@deploy1003: atsuko, dcausse: Continuing with deployment
* 07:40 dcausse@deploy1003: atsuko, dcausse: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:39 gkyziridis@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
* 07:39 gkyziridis@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
* 07:38 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286371{{!}}translate: add opensearch-ttmserver-test (T425377)]]
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
* 07:37 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
* 07:34 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 09m 32s)
* 07:30 dcausse@deploy1003: dcausse, wmde-fisch: Continuing with deployment
* 07:27 dcausse@deploy1003: dcausse, wmde-fisch: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:25 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286400{{!}}testwiki: Disable sub-ref's synthetic list defined refs on test wikis (T425967)]], [[gerrit:1286277{{!}}Revert^2 "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:18 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 07:18 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 07:17 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 07:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2218: after reimage to trixie
* 07:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: after reimage to trixie
* 06:39 moritzm: installing Exim security updates on the hosts where Exim is used as a local mail relay
* 06:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2218: after reimage to trixie
* 06:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS trixie
* 06:26 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1253: after reimage to trixie
* 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1253.eqiad.wmnet with OS trixie
* 06:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
* 05:54 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1253.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1253.eqiad.wmnet with OS trixie
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS trixie
* 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1253: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Reimage to Trixie
* 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2218: Reimage to Trixie
* 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Reimage to Trixie
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1278.eqiad.wmnet with OS bookworm
* 04:20 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:20 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 04:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:57 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1278.eqiad.wmnet with reason: host reimage
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1277.eqiad.wmnet with OS bookworm
* 03:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:42 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1278.eqiad.wmnet with OS bookworm
* 03:28 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1276.eqiad.wmnet with OS bookworm
* 03:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:17 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1277.eqiad.wmnet with reason: host reimage
* 03:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1278.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1278
* 03:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:08 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1278
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 03:07 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:07 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1278] - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1276.eqiad.wmnet with reason: host reimage
* 03:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 03:02 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1277.eqiad.wmnet with OS bookworm
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:49 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1276.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1275.eqiad.wmnet with OS bookworm
* 02:37 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:35 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1277.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:28 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1277
* 02:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1277
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:25 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1277] - vriley@cumin1003"
* 02:21 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:19 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1274.eqiad.wmnet with OS bookworm
* 02:18 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1276.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:15 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1276
* 02:13 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1275.eqiad.wmnet with reason: host reimage
* 02:11 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1276
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:10 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1276] - vriley@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 44s)
* 02:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1275.eqiad.wmnet with OS bookworm
* 01:37 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] (duration: 06m 35s)
* 01:28 zabe@deploy1003: zabe: Continuing with deployment
* 01:27 zabe@deploy1003: zabe: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:27 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1274.eqiad.wmnet with OS bookworm
* 01:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1286532{{!}}Start reading from new tables everywhere except commons (2nd try) (T416548)]]
* 01:18 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1275.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:14 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 01:12 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1275
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:12 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1275] - vriley@cumin1003"
* 01:08 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1274.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:58 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1274
* 00:57 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1274
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:56 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:56 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1274] - vriley@cumin1003"
* 00:52 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1273.eqiad.wmnet with OS bookworm
* 00:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-05-12 ==
* 23:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:48 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1273.eqiad.wmnet with reason: host reimage
* 23:46 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] (duration: 12m 45s)
* 23:40 cscott@deploy1003: cscott: Continuing with deployment
* 23:39 cscott@deploy1003: cscott: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286506{{!}}Re-enable unit tests with updated output]], [[gerrit:1286516{{!}}Re-enable ContentHolderTest with updated output]], [[gerrit:1286515{{!}}Revert "Remove File::getHandler language fallback" (T425988)]]
* 23:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] (duration: 33m 28s)
* 23:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1273.eqiad.wmnet with OS bookworm
* 22:53 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:49 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1272.eqiad.wmnet with OS bookworm
* 22:40 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:40 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:32 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286514{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286513{{!}}Also merge views overflow into array-items (T426115)]], [[gerrit:1286421{{!}}Special:Preferences: Display three options for thumbsizes (T424910)]]
* 22:21 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:21 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1272.eqiad.wmnet with reason: host reimage
* 22:18 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] (duration: 34m 01s)
* 22:05 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 22:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:01 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:59 dwisehaupt@dns1004: END - running authdns-update
* 21:57 dwisehaupt@dns1004: START - running authdns-update
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1271.eqiad.wmnet with OS bookworm
* 21:50 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:46 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:43 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286456{{!}}Disable interactions until load is complete (T422968 T424787)]]
* 21:42 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1273.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:41 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1273
* 21:40 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1272.eqiad.wmnet with OS bookworm
* 21:39 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1273
* 21:38 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] (duration: 11m 56s)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:38 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:38 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1273] - vriley@cumin1003"
* 21:32 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:31 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Continuing with deployment
* 21:29 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:29 cscott@deploy1003: danielyepezgarces, cscott, vadymts1: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:26 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1283048{{!}}Enabling RSS extension for cowikimedia chapter (T425440)]], [[gerrit:1286390{{!}}Set $wgSignatureAllowedLintErrors to an empty array on Spanish Wiktionary (T425332)]]
* 21:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:23 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 21:19 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] (duration: 14m 51s)
* 21:15 cscott@deploy1003: cscott: Continuing with deployment
* 21:15 topranks: migrate link from cr2-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:07 cscott@deploy1003: cscott: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:06 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1270.eqiad.wmnet with OS bookworm
* 21:05 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 21:05 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1286484{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T409751 T420336 T425981)]], [[gerrit:1286485{{!}}Bump wikimedia/parsoid to 0.24.0-a3 (T425981)]], [[gerrit:1286488{{!}}Disable unit tests that fail with new vendor release]], [[gerrit:1286489{{!}}Skip ContentHolderTest that fails with new vendor release]]
* 21:03 topranks: migrate link from cr1-drmrs to asw1-b13-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 21:01 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:54 topranks: migrate link from cr2-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:51 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1271.eqiad.wmnet with OS bookworm
* 20:50 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] (duration: 09m 03s)
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:46 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new link networks - cmooney@cumin1003"
* 20:46 samtar@deploy1003: samtar, dreamrimmer: Continuing with deployment
* 20:44 topranks: migrate link from cr1-drmrs to asw1-b12-drmrs to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:43 samtar@deploy1003: samtar, dreamrimmer: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:42 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1270.eqiad.wmnet with reason: host reimage
* 20:41 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 20:41 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1271.eqiad.wmnet with reason: host reimage
* 20:41 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1285482{{!}}Allow svwiki bureaucrats to remove sysop rights (T425806)]]
* 20:35 topranks: migrate link from cr2-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:26 dbrant@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] (duration: 08m 27s)
* 20:25 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1271.eqiad.wmnet with OS bookworm
* 20:23 topranks: migrate link from cr1-esams to asw1-by27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:20 dbrant@deploy1003: dbrant: Continuing with deployment
* 20:20 dbrant@deploy1003: dbrant: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:18 dbrant@deploy1003: Started scap sync-world: Backport for [[gerrit:1285930{{!}}docroot: Add "get_login_creds" permission to Android app. (T426010)]]
* 20:16 topranks: migrate link from cr2-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:15 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] (duration: 11m 47s)
* 20:11 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 20:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:05 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:05 topranks: migrate link from cr1-esams to asw1-bw27-esams to L2 trunk on the switch side [[phab:T424611|T424611]]
* 20:03 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1285905{{!}}Enforce 2FA requirements for phase 2 groups (T423119)]], [[gerrit:1286469{{!}}Prepare $wgOATH2FARequiredGroupRemovalPages for phases 2 and 3 (T423119 T423120)]]
* 20:00 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:59 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:58 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:54 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:52 topranks: migrate link from cr2-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 topranks: migrate link from cr1-magru to asw1-b4-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:34 dancy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] (duration: 07m 07s)
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
* 19:30 dancy@deploy1003: jforrester, dancy: Continuing with deployment
* 19:30 dancy@deploy1003: jforrester, dancy: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:27 dancy@deploy1003: Started scap sync-world: Backport for [[gerrit:1286464{{!}}Fix MediaHandler caching to not preserve language (T425988 T425740 T425782)]]
* 19:26 topranks: migrate link from cr2-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 19:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 19:06 topranks: migrate link from cr1-magru to asw1-b3-magru to L2 trunk on the switch side [[phab:T424611|T424611]]
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:42 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:35 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 18:08 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:56 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] (duration: 16m 08s)
* 17:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:53 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:53 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:52 otto@deploy1003: otto: Continuing with deployment
* 17:52 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:52 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:51 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:50 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:48 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:46 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:45 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:42 otto@deploy1003: otto: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:40 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1286434{{!}}EventStreamConfig - ingest mediawiki.user_change into the Data Lake (T423952)]]
* 17:39 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:39 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:38 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 17:37 brett@cumin2002: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 17:37 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 17:37 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 17:36 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 17:36 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 17:35 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub: apply
* 16:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 16:25 moritzm: installing Exim security updates on lists/vrts hosts
* 16:00 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:57 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] (duration: 07m 22s)
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:52 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:48 ladsgroup@deploy1003: ladsgroup, neriah: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup, neriah: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:45 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1286384{{!}}wikinews: Remove unnecessary settings (T421796)]]
* 15:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:37 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 15:35 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 15:34 jelto: helm uninstall -n miscweb design-strategy - [[phab:T329991|T329991]]
* 15:33 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:31 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 15:30 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 15:29 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:28 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 15:25 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 15:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 15:24 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:23 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:23 dancy@deploy1003: Installation of scap version "4.264.0" completed for 1 hosts
* 15:22 dancy@deploy1003: Installing scap version "4.264.0" for 1 host(s)
* 15:17 dancy@deploy1003: Installing scap version "4.264.0" for 163 host(s)
* 15:12 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
* 15:12 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
* 15:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1270.eqiad.wmnet with OS bookworm
* 14:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 14:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:54 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:53 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:50 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bullseye
* 14:47 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bullseye
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test2001.codfw.wmnet with OS bookworm
* 14:45 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:44 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-toolhub-test: apply
* 14:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bullseye
* 14:39 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bullseye
* 14:36 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs2009 to dse-k8s-wdqs-test2001
* 14:34 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test2001
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test2001 on all recursors
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-wdqs-test1001.eqiad.wmnet with OS bookworm
* 14:32 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs2009 to dse-k8s-wdqs-test2001 - btullis@cumin1003"
* 14:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from wdqs1028 to dse-k8s-wdqs-test1001
* 14:28 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-wdqs-test1001
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:26 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-wdqs-test1001 on all recursors
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:26 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs2009 to dse-k8s-wdqs-test2001
* 14:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming wdqs1028 to dse-k8s-wdqs-test1001 - btullis@cumin1003"
* 14:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:22 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 14:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.rename from wdqs1028 to dse-k8s-wdqs-test1001
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
* 14:20 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
* 14:20 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
* 14:19 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
* 14:17 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 14:17 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 14:15 Lucas_WMDE: UTC afternoon backport+config window done
* 14:15 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] (duration: 07m 02s)
* 14:11 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 14:10 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1271.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:10 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1271
* 14:09 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:08 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286372{{!}}Revert "page_change - add revision.revert info"]]
* 14:08 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 14:08 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bullseye
* 14:07 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bullseye
* 14:07 root@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bullseye
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 14:07 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 14:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:07 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] (duration: 39m 36s)
* 14:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 14:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1271
* 14:05 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:05 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Rolling back deployment
* 14:05 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 14:04 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1272.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:04 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1272
* 14:03 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1272
* 14:02 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:02 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 14:02 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1272] - vriley@cumin1003"
* 13:57 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:57 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:54 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 vriley@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:51 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 13:51 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1270.eqiad.wmnet with OS bookworm
* 13:50 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 13:50 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 13:49 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 13:49 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 13:49 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 13:48 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 13:48 ottomata: roll restart eventgate main to pick up mediawiki/page/change/1.4.0 schema version for [[phab:T423583|T423583]]
* 13:32 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 13:29 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde, otto: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:27 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1286336{{!}}Keep all long, non-wrapping values inside parent element (T425176)]], [[gerrit:1286341{{!}}page_change - add revision.revert info]]
* 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2233.codfw.wmnet with reason: Reboot
* 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2006.codfw.wmnet with reason: Reboot
* 13:14 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] (duration: 07m 13s)
* 13:09 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:08 sbisson@deploy1003: sbisson: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1286334{{!}}ArticleGuidance: set sparql endpoint (T425389)]]
* 12:40 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
* 12:38 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
* 12:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
* 12:25 dreamyjazz@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T42…]]
* 12:20 dreamyjazz@deploy1003: dreamyjazz: Continuing with deployment
* 12:17 dreamyjazz@deploy1003: dreamyjazz: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425940)]] synced
* 12:15 dreamyjazz@deploy1003: Started scap sync-world: Backport for [[gerrit:1286328{{!}}Make DiscussionTools not show hCaptcha initially unless configured (T425955)]], [[gerrit:1286324{{!}}Show CAPTCHA if required for all edits before first edit attempt (T425955)]], [[gerrit:1286322{{!}}hCaptcha: Enable for DiscussionTools on testwiki (T426039)]], [[gerrit:1286318{{!}}hCaptcha: Enable for VisualEditor and MobileFrontend mediawikiwiki (T425…]]
* 12:10 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] (duration: 07m 45s)
* 12:06 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:04 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:02 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286309{{!}}Special:UserLogin: Instrument no-JS form submissions (T425631)]]
* 10:31 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 10:31 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add new networks ibgp peering - cmooney@cumin1003"
* 09:56 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] (duration: 07m 43s)
* 09:51 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:50 kharlan@deploy1003: kharlan: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:48 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1286295{{!}}Update UserEntitySerializer callers (T426026)]]
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92480 and previous config saved to /var/cache/conftool/dbconfig/20260512-092034-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92479 and previous config saved to /var/cache/conftool/dbconfig/20260512-091025-fceratto.json
* 09:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036', diff saved to https://phabricator.wikimedia.org/P92478 and previous config saved to /var/cache/conftool/dbconfig/20260512-090017-fceratto.json
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92477 and previous config saved to /var/cache/conftool/dbconfig/20260512-085009-fceratto.json
* 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1036 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92476 and previous config saved to /var/cache/conftool/dbconfig/20260512-083526-fceratto.json
* 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1036.eqiad.wmnet with reason: Maintenance
* 08:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2150: after reimage to trixie
* 08:17 aklapper@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 08:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1231: after reimage to trixie
* 08:08 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:07 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 08:03 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] (duration: 07m 02s)
* 08:00 dcausse@deploy1003: dcausse: Rolling back deployment
* 08:00 dcausse@deploy1003: dcausse: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:56 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1286253{{!}}Revert "cirrus: use a keywork tokenizer for the plain field for autocomplete"]]
* 07:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2150: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS trixie
* 07:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1231: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1231.eqiad.wmnet with OS trixie
* 07:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
* 06:59 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1231.eqiad.wmnet with reason: host reimage
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2142.codfw.wmnet
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:46 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2142.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 06:43 jayme@deploy1003: Finished scap sync-world: update rsyslog image, [[phab:T418200|T418200]] (duration: 07m 56s)
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1231.eqiad.wmnet with OS trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 06:42 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS trixie
* 06:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1231: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Reimage to Trixie
* 06:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2150: Reimage to Trixie
* 06:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Reimage to Trixie
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2142.codfw.wmnet
* 06:36 jayme@deploy1003: Started scap sync-world: update rsyslog image, [[phab:T418200|T418200]]
* 06:27 jayme@dns1004: END - running authdns-update
* 06:26 jayme@dns1004: START - running authdns-update
* 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]] (duration: 36m 36s)
* 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.2 refs [[phab:T423911|T423911]]
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 38s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:37 eevans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 00:37 eevans@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 00:36 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:35 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:14 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:07 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] (duration: 07m 24s)
* 00:03 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 00:02 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285907{{!}}Skin: Correct thumbnail class (T424910)]]
== 2026-05-11 ==
* 23:45 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] (duration: 06m 21s)
* 23:41 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:40 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:38 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285864{{!}}Exclude sitesupport from button/icon treatment, remove manual styling (T425721)]]
* 23:24 jdlrobson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] (duration: 06m 29s)
* 23:20 jdlrobson@deploy1003: jdlrobson: Continuing with deployment
* 23:19 jdlrobson@deploy1003: jdlrobson: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:18 jdlrobson@deploy1003: Started scap sync-world: Backport for [[gerrit:1285464{{!}}Add support for icons in toolbox (T424571)]]
* 21:51 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] (duration: 06m 26s)
* 21:47 cjming@deploy1003: cjming: Continuing with deployment
* 21:47 cjming@deploy1003: cjming: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1285916{{!}}WikiLambdaApi instrument: update schema (T415254)]]
* 21:29 maryum: Deployed security fix for [[phab:T425406|T425406]]
* 21:16 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 21:16 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 21:15 mstyles@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] (duration: 06m 36s)
* 21:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 mstyles@deploy1003: sbassett, mstyles: Continuing with deployment
* 21:10 mstyles@deploy1003: sbassett, mstyles: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 mstyles@deploy1003: Started scap sync-world: Backport for [[gerrit:1284008{{!}}Enable CSPUseReportURIDirective in Wikimedia production (T424058)]]
* 21:03 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1270.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:53 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:53 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1270] - vriley@cumin1003"
* 20:49 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1269.eqiad.wmnet with OS bookworm
* 20:48 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:41 jdrewniak@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] (duration: 09m 51s)
* 20:37 jdrewniak@deploy1003: jdrewniak: Continuing with deployment
* 20:36 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:33 jdrewniak@deploy1003: jdrewniak: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:32 jdrewniak@deploy1003: Started scap sync-world: Backport for [[gerrit:1285866{{!}}Bumping portals to master (T128546)]]
* 20:19 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:15 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1269.eqiad.wmnet with reason: host reimage
* 20:02 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] (duration: 06m 57s)
* 20:00 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1269.eqiad.wmnet with OS bookworm
* 19:58 zabe@deploy1003: zabe: Continuing with deployment
* 19:57 zabe@deploy1003: zabe: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:55 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1285853{{!}}Start reading from new file tables on all small and medium wikis (T416548)]]
* 19:44 sukhe@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs1017.eqiad.wmnet with OS bullseye
* 19:43 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:40 jmm@cumin2002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bvibber out of all services on: 2453 hosts
* 19:39 inflatador: [bking@cumin2002] ~$ sudo cumin 'A:wdqs-main and A:codfw' 'systemctl restart wdqs-blazegraph' <- restart after banning scraper
* 19:25 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1269.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:24 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1269
* 19:23 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1269
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:22 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1269] - vriley@cumin1003"
* 19:18 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1268.eqiad.wmnet with OS bookworm
* 19:16 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:16 dzahn@dns1005: END - running authdns-update
* 19:14 dzahn@dns1005: START - running authdns-update
* 19:12 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 19:11 inflatador: bking@archiva1002 `sudo rm -rfv /var/cache/archiva/temp* && sudo systemctl restart archiva` to free up disk space
* 18:56 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 18:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:49 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1268.eqiad.wmnet with reason: host reimage
* 18:25 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 18:13 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
* 18:13 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
* 18:12 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
* 18:12 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1006.eqiad.wmnet with OS trixie
* 18:12 ottomata: roll restarting eventgate-main to pick up changes for [[phab:T423952|T423952]]
* 18:07 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1268.eqiad.wmnet with OS bookworm
* 17:56 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1268.eqiad.wmnet with OS bookworm
* 17:55 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:52 sukhe@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs1017.eqiad.wmnet with OS bullseye
* 17:47 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 17:43 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 17:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1268.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92464 and previous config saved to /var/cache/conftool/dbconfig/20260511-173804-fceratto.json
* 17:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1268
* 17:34 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1268
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1268] - vriley@cumin1003"
* 17:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92463 and previous config saved to /var/cache/conftool/dbconfig/20260511-172756-fceratto.json
* 17:25 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 17:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047', diff saved to https://phabricator.wikimedia.org/P92462 and previous config saved to /var/cache/conftool/dbconfig/20260511-171747-fceratto.json
* 17:15 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
* 17:12 dancy@deploy1003: Installation of scap version "4.263.0" completed for 2 hosts
* 17:11 dancy@deploy1003: Installing scap version "4.263.0" for 2 host(s)
* 17:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92461 and previous config saved to /var/cache/conftool/dbconfig/20260511-170739-fceratto.json
* 17:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:07 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 17:06 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 17:05 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1006.eqiad.wmnet with OS trixie
* 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1047 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92460 and previous config saved to /var/cache/conftool/dbconfig/20260511-170024-fceratto.json
* 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1047.eqiad.wmnet with reason: Maintenance
* 16:56 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:51 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:50 eevans@deploy1003: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:40 eevans@deploy1003: helmfile [codfw] START helmfile.d/services/echostore: apply
* 16:39 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:39 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:38 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 16:37 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 16:36 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 16:27 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] (duration: 06m 54s)
* 16:25 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:25 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:24 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 16:23 eevans@deploy1003: helmfile [staging] START helmfile.d/services/echostore: apply
* 16:23 zabe@deploy1003: zabe: Continuing with deployment
* 16:22 zabe@deploy1003: zabe: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:20 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281506{{!}}Disable FlaggedRevs on wikinews (T423577)]]
* 16:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:02 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:00 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 16:00 eevans@deploy1003: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 15:58 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] (duration: 07m 48s)
* 15:54 zabe@deploy1003: zabe: Continuing with deployment
* 15:52 zabe@deploy1003: zabe: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:50 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281491{{!}}Remove custom user groups from Wikinews (T423578)]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:46 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] (duration: 06m 32s)
* 15:42 zabe@deploy1003: zabe: Continuing with deployment
* 15:41 zabe@deploy1003: zabe: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:40 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:39 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280418{{!}}Start reading from new file tables on testwiki (2nd try) (T416548)]]
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 15:21 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 15:17 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:55 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs2012.codfw.wmnet with reason: DIMM replacement
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:54 cdanis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:47 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:46 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:43 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host lvs1017
* 14:42 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host lvs1017
* 14:42 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 14:41 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:39 Lucas_WMDE: UTC afternoon backport+config window done
* 14:39 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]] (duration: 18
* 14:38 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 14:33 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Continuing with deployment
* {{safesubst:SAL entry|1=14:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, jforrester, matmarex, sfaci: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now}}
* 14:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285448{{!}}Grant 'createpreviouslyrenamedaccount' to account creators and sysop-likes (T196386)]], [[gerrit:1278704{{!}}WikiLambdaApi: update stream configuration (T415254)]], [[gerrit:1285352{{!}}WikiLambdaApi instrument: Sets the custom schemaID (T415254)]], [[gerrit:1285406{{!}}editSaves: getExperiment returns a promise now (T425785)]]
* {{safesubst:SAL entry|1=14:18 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (}}
* 14:15 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bookworm
* 14:12 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host lvs1017.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:05 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Continuing with deployment
* {{safesubst:SAL entry|1=14:04 lucaswerkmeister-wmde@deploy1003: matmarex, lucaswerkmeister-wmde: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group}}
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-eqiad@eqiad
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
* 13:56 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 13:50 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-eqiad@eqiad
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: dse-k8s-worker-codfw@codfw
* 13:50 btullis@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:49 btullis@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 13:47 btullis@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: dse-k8s-worker-codfw@codfw
* 13:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1285460{{!}}Prevent username registration if the username previously existed (T196386)]], [[gerrit:1285461{{!}}Prevent username registration if the username previously existed (v2) (T196386)]], [[gerrit:1285462{{!}}API: Introduce list=globalusers (T261752)]], [[gerrit:1285761{{!}}list=globalusers: Avoid querying group permissions with empty group list (T}}
* 13:36 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 13:34 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:34 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:32 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:30 btullis: restarting pybal on lvs1019 and lvs1020 for [[phab:T420437|T420437]]
* 13:26 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] (duration: 06m 28s)
* 13:25 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 13:24 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
* 13:22 jiji@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc1055.eqiad.wmnet with OS trixie
* 13:22 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Continuing with deployment
* 13:21 lucaswerkmeister-wmde@deploy1003: audreypenven, lucaswerkmeister-wmde: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 13:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for [[gerrit:1270482{{!}}Enable and configure WikiProjects prototype on Wikidata beta (T421850)]]
* 13:19 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:19 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:18 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:16 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:15 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:14 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:07 otto@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] (duration: 08m 05s)
* 13:06 elukey: remove old discovery pki intermediate
* 13:03 otto@deploy1003: otto: Continuing with deployment
* 13:01 otto@deploy1003: otto: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:59 otto@deploy1003: Started scap sync-world: Backport for [[gerrit:1285525{{!}}EventStreamConfig - add mediawiki.user_change.dev0 (T423952)]]
* 12:59 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:58 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:53 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] (duration: 12m 07s)
* 12:47 kharlan@deploy1003: kharlan: Continuing with deployment
* 12:45 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285789{{!}}hCaptcha: Enable editing on group0 wikis (T425354)]]
* 12:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:18 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS trixie
* 12:04 topranks: push out updated ACL to Nokia switches for BGP connections ([[phab:T425703|T425703]]) and add BFD config ([[phab:T425813|T425813]])
* 11:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2185.codfw.wmnet with reason: Reboot
* 11:31 moritzm: installing Linux 6.12.86 on Trixie hosts
* 11:27 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 11:27 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 11:21 jayme@deploy1003: Finished scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]] (duration: 13m 28s)
* 11:21 jayme@deploy1003: Rolling back deployment
* 11:08 jayme@deploy1003: Started scap sync-world: upgrade rsyslog on all deployments [[phab:T418200|T418200]]
* 11:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
* 10:59 jayme: upgrading rsyslog to 8.2504.0-1 in all mediawiki deployments - [[phab:T418200|T418200]]
* 10:52 taavi@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Clément Goubert out of all services on: 2459 hosts
* 10:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 10:26 jayme@deploy1003: Finished scap sync-world: update rsyslog image (duration: 03m 48s)
* 10:23 jayme@deploy1003: Started scap sync-world: update rsyslog image
* 10:22 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [eqiad] START helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/services/ratelimit: apply
* 10:21 jayme@deploy1003: helmfile [codfw] START helmfile.d/services/ratelimit: apply
* 10:16 slyngs: Migration of lvs2012 due to hardware issues
* 10:14 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:13 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 10:12 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 10:11 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] (duration: 30m 15s)
* 10:10 moritzm: rebalance routed Ganeti cluster in eqsin [[phab:T421863|T421863]]
* 10:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 10:01 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:59 kharlan@deploy1003: kharlan: Continuing with deployment
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:58 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:58 kharlan@deploy1003: kharlan: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:57 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2012.codfw.wmnet with reason: Hardware failure
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:46 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:42 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1230: [[phab:T419635|T419635]]
* 09:41 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1285731{{!}}hCaptcha: Enable for group0 wikis (T425354)]]
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:37 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92456 and previous config saved to /var/cache/conftool/dbconfig/20260511-092010-fceratto.json
* 09:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92454 and previous config saved to /var/cache/conftool/dbconfig/20260511-091001-fceratto.json
* 09:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:08 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:07 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:06 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install5004.wikimedia.org to drbd
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P92453 and previous config saved to /var/cache/conftool/dbconfig/20260511-085954-fceratto.json
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:58 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:56 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1230: [[phab:T419635|T419635]]
* 08:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92451 and previous config saved to /var/cache/conftool/dbconfig/20260511-084945-fceratto.json
* 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install5004.wikimedia.org to drbd
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92450 and previous config saved to /var/cache/conftool/dbconfig/20260511-084236-fceratto.json
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 08:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5004.eqsin.wmnet to cluster eqsin02 and group 01
* 08:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 08:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 08:10 slyngshede@dns1004: END - running authdns-update
* 08:08 slyngshede@dns1004: START - running authdns-update
* 08:05 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 08:05 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:00 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 08:00 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old eqsin ganeti cluster VIP - ayounsi@cumin1003"
* 07:56 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:55 jayme@deploy1003: helmfile [staging] DONE helmfile.d/services/ratelimit: apply
* 07:50 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:49 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:48 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:47 jayme@deploy1003: helmfile [staging] START helmfile.d/services/ratelimit: apply
* 07:24 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:23 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 07:08 elukey@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) zarcillo.discovery.wmnet on all recursors
* 07:08 elukey@cumin1003: START - Cookbook sre.dns.wipe-cache zarcillo.discovery.wmnet on all recursors
* 06:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5004.eqsin.wmnet with OS bookworm
* 06:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5004.eqsin.wmnet with reason: host reimage
* 06:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM acmechief2002.codfw.wmnet
* 06:08 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM acmechief2002.codfw.wmnet
* 06:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast6003.wikimedia.org
* 05:57 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast6003.wikimedia.org
* 05:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5004.eqsin.wmnet with OS bookworm
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 58s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-10 ==
* 18:25 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:20 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 18:11 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425504]]' ESEAP_Hub_Charter 'ESEAP Hub/Governance/Charter/Previous draft' 'Martin Urbanec' # [[phab:T425504|T425504]]
* 18:09 urbanecm@deploy1003: mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=metawiki '--reason=per [[:phab:T425503]]' ESEAP_Preparatory_Council/Proposed_theory_of_change 'ESEAP Hub/Governance/ESEAP Preparatory Council/Proposed theory of change' 'Martin Urbanec' # [[phab:T425503|T425503]]
* 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-09 ==
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 10:34 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Fix dsl column size - oblivian@cumin1003
* 10:33 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Fix dsl column size - oblivian@cumin1003"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1267.eqiad.wmnet with OS bookworm
* 01:06 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:06 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:44 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1267.eqiad.wmnet with reason: host reimage
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1267.eqiad.wmnet with OS bookworm
* 00:17 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-05-08 ==
* 23:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1267.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:35 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1267
* 23:32 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1267
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1267] - vriley@cumin1003"
* 23:26 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1266.eqiad.wmnet with OS bookworm
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 22:54 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:46 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1266.eqiad.wmnet with reason: host reimage
* 22:26 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1266.eqiad.wmnet with OS bookworm
* 22:16 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:56 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1266.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:55 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1266
* 21:53 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1266
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:51 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1266] - vriley@cumin1003"
* 21:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1265.eqiad.wmnet with OS bookworm
* 21:42 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:41 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 21:24 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 21:19 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1265.eqiad.wmnet with reason: host reimage
* 20:54 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host db1265.eqiad.wmnet with OS bookworm
* 20:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:32 vriley@cumin1003: START - Cookbook sre.hosts.provision for host db1265.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1265
* 20:30 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db1265
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:29 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:29 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [db1265] - vriley@cumin1003"
* 20:24 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 20:01 ryankemper: [WDQS] Added several more requestctl rules. They've helped marginally, but not enough to restore the service. Unless we find an obvious smoking gun, expect noise to continue for the time being :/
* 19:42 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 19:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 18:07 ryankemper: [WDQS] After those 2 requestctl rules, requests went down 20%, the error rate decreased significantly, and p50 was cut almost in half, but the service is still unstable; we'll likely need to identify more throttle candidates to restore full health
* 17:53 ryankemper: [WDQS] Deployed 2 new requestctl rules; we'll see if it helps
* 16:51 topranks: enable bfd on system0.0 sub-interface ssw1-d1-eqiad
* 15:45 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup1003.eqiad.wmnet with reason: restart
* 15:37 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup[1006,1017-1018].eqiad.wmnet with reason: restart
* 14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 14:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:51 btullis: re-pooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 10:50 btullis@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=wdqs-main,name=eqiad
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:14 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on backup1007.eqiad.wmnet with reason: restart
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:12 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:11 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:09 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:44 btullis: depooled wdqs-main in eqiad for [[phab:T425758|T425758]]
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:40 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:40 btullis@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=wdqs-main,name=eqiad
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92437 and previous config saved to /var/cache/conftool/dbconfig/20260508-093251-fceratto.json
* 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92435 and previous config saved to /var/cache/conftool/dbconfig/20260508-092243-fceratto.json
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P92434 and previous config saved to /var/cache/conftool/dbconfig/20260508-091238-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92433 and previous config saved to /var/cache/conftool/dbconfig/20260508-090230-fceratto.json
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92432 and previous config saved to /var/cache/conftool/dbconfig/20260508-085217-fceratto.json
* 08:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92431 and previous config saved to /var/cache/conftool/dbconfig/20260508-085018-fceratto.json
* 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92430 and previous config saved to /var/cache/conftool/dbconfig/20260508-084010-fceratto.json
* 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P92429 and previous config saved to /var/cache/conftool/dbconfig/20260508-083003-fceratto.json
* 08:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92428 and previous config saved to /var/cache/conftool/dbconfig/20260508-081954-fceratto.json
* 08:18 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:17 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2207 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92427 and previous config saved to /var/cache/conftool/dbconfig/20260508-080438-fceratto.json
* 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 07:59 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 07:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install5003.wikimedia.org
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install5003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2159: after reimage to trixie
* 06:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts install5003.wikimedia.org
* 06:18 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2159: after reimage to trixie
* 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2159.codfw.wmnet with OS trixie
* 06:11 moritzm: installing postorius security updates
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2159.codfw.wmnet with reason: host reimage
* 05:27 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2159.codfw.wmnet with OS trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2159: Reimage to Trixie
* 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Reimage to Trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1024.eqiad.wmnet with OS trixie
* 03:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 03:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 02:50 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:45 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1024.eqiad.wmnet with reason: host reimage
* 02:32 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1024.eqiad.wmnet with OS trixie
* 02:23 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1024.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 02:07 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1024
* 02:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1024
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:04 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:04 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1024] - vriley@cumin1003"
* 02:01 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1023.eqiad.wmnet with OS trixie
* 01:52 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:30 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 01:15 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 01:11 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1023.eqiad.wmnet with reason: host reimage
* 00:59 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1023.eqiad.wmnet with OS trixie
* 00:49 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:37 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1023
* 00:36 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1023
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 00:27 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:27 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1023] - vriley@cumin1003"
* 00:20 vriley@cumin1003: START - Cookbook sre.dns.netbox
== 2026-05-07 ==
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1022.eqiad.wmnet with OS trixie
* 23:25 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:24 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 23:09 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 23:05 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1022.eqiad.wmnet with reason: host reimage
* 22:53 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 22:25 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19] (duration: 01m 53s)
* 22:23 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (thin): Regular analytics weekly train THIN [analytics/refinery@b38efb19]
* 22:23 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19] (duration: 03m 52s)
* 22:19 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1]: Regular analytics weekly train [analytics/refinery@b38efb19]
* 22:18 amastilovic@deploy1003: Finished deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19] (duration: 01m 55s)
* 22:16 amastilovic@deploy1003: Started deploy [analytics/refinery@b38efb1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@b38efb19]
* 21:27 cscott@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]
* 21:23 cscott@deploy1003: cscott: Continuing with deployment
* 21:17 cscott@deploy1003: cscott: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]] synced to the t
* 21:16 cscott@deploy1003: Started scap sync-world: Backport for [[gerrit:1284828{{!}}Upgrading webonyx/graphql-php (v15.31.5 => v15.32.3)]], [[gerrit:1284834{{!}}composer.json: Update webonyx/graphql-php to ^15.32.3]], [[gerrit:1284832{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T319058 T368724 T373384 T420336 T423241 T423701 T424446 T424773 T425008 T425056 T425107 T425731)]], [[gerrit:1284837{{!}}Bump wikimedia/parsoid to 0.24.0-a2 (T425731)]]
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1021.eqiad.wmnet with OS trixie
* 20:53 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:49 kemayo@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] (duration: 06m 38s)
* 20:48 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 20:45 kemayo@deploy1003: esanders, kemayo: Continuing with deployment
* 20:44 kemayo@deploy1003: esanders, kemayo: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be v
* 20:42 kemayo@deploy1003: Started scap sync-world: Backport for [[gerrit:1284575{{!}}Revert "Enable mobile editor abandonment survey on enwiki" (T424102)]], [[gerrit:1284702{{!}}Remove duplicate definition of EditCheckAction#isTagged (T425583)]], [[gerrit:1284703{{!}}Save action filtering info in ContentBranchNodeCheck#onDocumentChange (T425583)]]
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php commonswiki
* 20:41 Krinkle: krinkle@deploy1003$ mwscript deleteEqualMessages.php nlwiki
* 20:34 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:30 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1021.eqiad.wmnet with reason: host reimage
* 20:29 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:28 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 20:14 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] (duration: 07m 18s)
* 20:10 arlolra@deploy1003: arlolra, mmartorana: Continuing with deployment
* 20:10 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 20:09 arlolra@deploy1003: arlolra, mmartorana: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:07 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1284687{{!}}Provide page context for LintErrorChecker (T419596)]], [[gerrit:1284771{{!}}Make email confirmation banner a standalone RL module (T425677)]]
* 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1022.eqiad.wmnet with OS trixie
* 19:59 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 19:09 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1022.eqiad.wmnet with OS trixie
* 19:04 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:52 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:51 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1022
* 18:49 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1022
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:49 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:49 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1022~] - vriley@cumin1003"
* 18:45 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 18:26 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:26 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:25 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:24 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:22 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:21 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:20 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:19 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:18 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 brennen@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:06 cdanis@dns1005: END - running authdns-update
* 18:04 cdanis@dns1005: START - running authdns-update
* 18:02 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] (duration: 29m 24s)
* 18:02 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to all wikis
* 17:59 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 17:58 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 17:51 krinkle@deploy1003: krinkle: Continuing with deployment
* 17:50 krinkle@deploy1003: krinkle: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:45 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 17:45 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 17:33 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1237662{{!}}Profiler: Set explicit "excimer-wall" redis channel instead of concat]]
* 17:32 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 17:32 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 17:06 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet,db1216.eqiad.wmnet with reason: restart
* 16:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet,db1245.eqiad.wmnet with reason: restart
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:48 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:47 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 16:35 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 16:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 16:32 jynus: restarting backup1-* database primary hosts
* 16:30 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2183.codfw.wmnet,db1204.eqiad.wmnet with reason: restart
* 16:25 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 36 hosts with reason: restart
* 16:14 sukhe@dns1004: END - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:13 sukhe@dns1004: START - running authdns-update
* 16:12 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 16:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 16:01 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:50 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on ms-backup[2003-2004].codfw.wmnet,ms-backup[1003-1004].eqiad.wmnet with reason: restart
* 15:44 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox and A:ulsfo and (A:dnsbox)
* 15:32 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 15:32 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 15:31 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:31 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:24 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 15:23 sukhe@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 39 hosts
* 15:22 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 39 hosts
* 15:18 sukhe@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:18 sukhe@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 15:15 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet
* 15:12 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:12 sukhe@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo for service: upload-addrs [reason: no reason specified, no task ID specified]
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 15:06 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:05 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:05 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:03 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 15:03 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 15:03 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:01 akhatun: Deployed refinery using scap, then deployed onto hdfs
* 14:58 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-eqiad
* 14:54 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 14:54 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 14:53 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 14:52 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:52 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 14:52 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 14:50 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:44 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c] (duration: 02m 01s)
* 14:43 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 14:43 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 14:42 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (thin): Regular analytics weekly train THIN [analytics/refinery@4734c67c]
* 14:40 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c] (duration: 04m 38s)
* 14:40 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-eqiad
* 14:37 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 14:36 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 14:36 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67]: Regular analytics weekly train [analytics/refinery@4734c67c]
* 14:35 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 14:35 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 14:33 akhatun@deploy1003: Finished deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c] (duration: 01m 54s)
* 14:32 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:32 slyngshede@dns1004: END - running authdns-update
* 14:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:31 akhatun@deploy1003: Started deploy [analytics/refinery@4734c67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@4734c67c]
* 14:31 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:31 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 14:30 slyngshede@dns1004: START - running authdns-update
* 14:30 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 14:30 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 14:30 akhatun: Deploying Refinery at {{Gerrit|4734c67}} for weekly deployment train
* 14:30 jmm@dns1004: END - running authdns-update
* 14:29 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 14:28 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 14:28 jmm@dns1004: START - running authdns-update
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:28 slyngshede@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating DNS snippets - slyngshede@cumin1003"
* 14:26 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 14:26 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 14:25 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 14:24 slyngshede@cumin1003: START - Cookbook sre.dns.netbox
* 14:12 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-main-codfw
* 14:12 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 14:12 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 14:10 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
* 13:53 jasmine@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-main-codfw
* 13:34 stran@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] (duration: 09m 05s)
* 13:30 stran@deploy1003: stran: Continuing with deployment
* 13:27 stran@deploy1003: stran: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:25 stran@deploy1003: Started scap sync-world: Backport for [[gerrit:1284553{{!}}Enable staggered rollout for IRS on enwiki (T424008)]], [[gerrit:1284569{{!}}Fix when user is considered exposed to the feature in the experiment (T424075)]]
* 13:23 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:10 jforrester@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] (duration: 06m 55s)
* 13:06 jforrester@deploy1003: rzl, jforrester, hartman: Continuing with deployment
* 13:05 jforrester@deploy1003: rzl, jforrester, hartman: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 jforrester@deploy1003: Started scap sync-world: Backport for [[gerrit:1284547{{!}}Remove the progress bar]], [[gerrit:1275467{{!}}mc: Set server, instead of host and port, for wgWikiLambdaObjectCaches (T423311)]]
* 13:02 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: name=dns4004.wikimedia.org [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:58 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:55 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 12:51 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 12:51 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 12:50 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 12:45 sukhe@dns1004: FAIL - running authdns-update
* 12:44 sukhe@dns1004: START - running authdns-update
* 12:30 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1205.eqiad.wmnet with OS trixie
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5004.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install5004.wikimedia.org with OS bookworm
* 12:23 slyngshede@dns1004: FAIL - running authdns-update
* 12:21 slyngshede@dns1004: START - running authdns-update
* 12:18 moritzm: installing init-system-helpers bugfix updates from Bookworm point release
* 12:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:17 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "add lswtest back as being planned won't work - cmooney@cumin1003"
* 12:12 slyngshede@dns1004: FAIL - running authdns-update
* 12:11 slyngshede@dns1004: START - running authdns-update
* 12:11 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:11 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:11 slyngshede@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=ulsfo,service=authdns-update [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 12:08 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: after reimage to trixie
* 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 12:02 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:02 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1205.eqiad.wmnet with reason: host reimage
* 12:00 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on install5004.wikimedia.org with reason: host reimage
* 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: after reimage to trixie
* 11:47 root@cumin1003: START - Cookbook sre.hosts.reimage for host db1205.eqiad.wmnet with OS trixie
* 11:46 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1205.eqiad.wmnet with reason: reimage
* 11:43 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:43 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:40 root@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS trixie
* 11:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:36 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:35 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:20 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2168: after reimage to trixie
* 11:19 root@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:17 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS trixie
* 11:16 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:15 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:15 root@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92412 and previous config saved to /var/cache/conftool/dbconfig/20260507-111424-fceratto.json
* 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1227: after reimage to trixie
* 11:11 moritzm: installing modsecurity-apache security updates
* 11:10 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1227.eqiad.wmnet with OS trixie
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install5004.wikimedia.org with OS bookworm
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92409 and previous config saved to /var/cache/conftool/dbconfig/20260507-110415-fceratto.json
* 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:59 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:58 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host db2184
* 10:58 root@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 10:57 root@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: START - Cookbook sre.dns.wipe-cache db2184.codfw.wmnet 129.32.192.10.in-addr.arpa 9.2.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:57 root@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] (duration: 08m 40s)
* 10:55 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:54 root@cumin1003: START - Cookbook sre.dns.netbox
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92407 and previous config saved to /var/cache/conftool/dbconfig/20260507-105407-fceratto.json
* 10:51 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 10:51 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
* 10:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:49 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:49 root@cumin1003: START - Cookbook sre.hosts.move-vlan for host db2184
* 10:48 root@cumin1003: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS trixie
* 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:48 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:47 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:47 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284592{{!}}Close Russian Wikinews (T421796)]]
* 10:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92406 and previous config saved to /var/cache/conftool/dbconfig/20260507-104359-fceratto.json
* 10:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1227.eqiad.wmnet with reason: host reimage
* 10:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2184.codfw.wmnet with reason: reimage
* 10:40 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:39 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92405 and previous config saved to /var/cache/conftool/dbconfig/20260507-103349-fceratto.json
* 10:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:32 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS trixie
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5002.wikimedia.org
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2168: Reimage to Trixie
* 10:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Reimage to Trixie
* 10:30 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2182: after reimage to trixie
* 10:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1227.eqiad.wmnet with OS trixie
* 10:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1227: Reimage to Trixie
* 10:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Reimage to Trixie
* 10:26 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: after reimage to trixie
* 10:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:21 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:20 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5002.wikimedia.org
* 10:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:14 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 10:13 moritzm: rebalance ganeti cluster in ulsfo following host reimages [[phab:T424686|T424686]]
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy5001.wikimedia.org
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:11 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS trixie
* 10:10 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 10:03 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 09:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy5001.wikimedia.org
* 09:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 09:44 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2182: after reimage to trixie
* 09:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2182.codfw.wmnet with OS trixie
* 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1202: after reimage to trixie
* 09:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS trixie
* 09:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 09:32 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4003.wikimedia.org to drbd
* 09:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 09:25 elukey@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:24 elukey@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1006.eqiad.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS trixie
* 09:18 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM bast4006.wikimedia.org
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 09:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2182.codfw.wmnet with reason: host reimage
* 09:11 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM bast4006.wikimedia.org
* 09:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: After reimage
* 09:07 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2182.codfw.wmnet with OS trixie
* 08:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Reimage to Trixie
* 08:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2182: Reimage to Trixie
* 08:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Reimage to Trixie
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Reimage to Trixie
* 08:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Reimage to Trixie
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2144.codfw.wmnet
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2144.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 08:37 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 08:32 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2144.codfw.wmnet
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of prometheus4003.ulsfo.wmnet to drbd
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:28 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2144 [[phab:T425522|T425522]]', diff saved to https://phabricator.wikimedia.org/P92389 and previous config saved to /var/cache/conftool/dbconfig/20260507-082822-marostegui.json
* 08:23 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:23 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2208: After reimage
* 08:23 XioNoX: drmrs remove old v6 gateway IP
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2208: After reimage
* 08:21 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: drmrs v6 gateway IPs change - ayounsi@cumin1003"
* 08:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4004.ulsfo.wmnet to drbd
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 08:12 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 08:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4008.ulsfo.wmnet
* 08:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4008.ulsfo.wmnet
* 08:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 08:03 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 07:54 dcausse@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] (duration: 09m 46s)
* 07:49 dcausse@deploy1003: dcausse: Continuing with deployment
* 07:46 dcausse@deploy1003: dcausse: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:44 dcausse@deploy1003: Started scap sync-world: Backport for [[gerrit:1269465{{!}}search: add alt. completion indices to test keyword tokenizer (2/2) (T420427)]]
* 07:32 moritzm: installing apache2 security updates
* 07:30 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of netflow4003.ulsfo.wmnet to drbd
* 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
* 07:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:48 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.changedisk (exit_code=99) for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of ncredir4003.ulsfo.wmnet to drbd
* 06:42 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:41 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 01
* 06:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2207: after reimage to trixie
* 05:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2207: after reimage to trixie
* 05:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS trixie
* 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS trixie
* 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
* 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:04 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
* 05:03 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS trixie
* 05:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2207: Reimage to Trixie
* 05:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Reimage to Trixie
* 04:52 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2207 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92383 and previous config saved to /var/cache/conftool/dbconfig/20260507-045219-marostegui.json
* 04:51 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2204 to s2 primary [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92382 and previous config saved to /var/cache/conftool/dbconfig/20260507-045141-marostegui.json
* 04:51 marostegui: Starting s2 codfw failover from db2207 to db2204 - [[phab:T424848|T424848]]
* 04:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 [[phab:T424848|T424848]]
* 04:46 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2204 with weight 0 [[phab:T424848|T424848]]', diff saved to https://phabricator.wikimedia.org/P92381 and previous config saved to /var/cache/conftool/dbconfig/20260507-044651-marostegui.json
* 04:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 35s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:15 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] (duration: 12m 57s)
* 01:09 zabe@deploy1003: zabe: Continuing with deployment
* 01:09 zabe@deploy1003: zabe: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:02 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281485{{!}}Drop some unneeded wikinews configs (T421796)]]
* 01:01 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 00:43 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] (duration: 33m 54s)
* 00:31 zabe@deploy1003: zabe: Continuing with deployment
* 00:29 zabe@deploy1003: zabe: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1277783{{!}}Undeploy GoogleNewsSitemap (T421798)]]
== 2026-05-06 ==
* 23:41 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 23:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1021.eqiad.wmnet with OS trixie
* 23:14 ladsgroup@deploy1003: Synchronized portals: Sync portals for removal of Wikinews (duration: 02m 22s)
* 23:12 ladsgroup@deploy1003: Synchronized portals/wikipedia.org/assets: Sync portals for removal of Wikinews (duration: 06m 12s)
* 22:50 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] (duration: 07m 08s)
* 22:46 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:45 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:43 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1284004{{!}}Close Spanish Wikinews (T421796)]]
* 22:33 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] (duration: 06m 40s)
* 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:28 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:26 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283872{{!}}Close English Wikinews (T421796)]]
* 22:18 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host pc1021.eqiad.wmnet with OS trixie
* 22:14 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:14 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] (duration: 06m 25s)
* 22:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:11 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:10 cjming@deploy1003: cjming: Continuing with deployment
* 22:10 cjming@deploy1003: cjming: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:08 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1283972{{!}}UBN fix: guard entry.serverTiming before forEach (T425591)]]
* 22:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 22:05 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 22:04 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:52 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] (duration: 06m 56s)
* 21:48 zabe@deploy1003: zabe: Continuing with deployment
* 21:47 zabe@deploy1003: zabe: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:45 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1283953{{!}}Disable GNSM on dewikinews (T421798)]]
* 21:31 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:28 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:27 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 21:26 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:22 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:17 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:15 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:14 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:11 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:11 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [pc1021] - vriley@cumin1003"
* 21:07 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 21:06 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc1021
* 21:05 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host pc1021
* 21:04 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host pc1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:28 catrope@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] (duration: 09m 12s)
* 20:25 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 20:24 catrope@deploy1003: catrope, somerandomdeveloper: Continuing with deployment
* 20:21 catrope@deploy1003: catrope, somerandomdeveloper: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 catrope@deploy1003: Started scap sync-world: Backport for [[gerrit:1281526{{!}}Replace use of $wgRequest (T336703)]]
* 20:14 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 20:00 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS trixie
* 19:24 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
* 19:23 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 19:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4010.ulsfo.wmnet with OS bookworm
* 19:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 19:01 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
* 18:59 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 18:59 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 18:59 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:55 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:55 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:54 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:54 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 18:54 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:53 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:49 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
* 18:48 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 18:47 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:47 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:42 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 18:42 brennen@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:42 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 18:41 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 18:40 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 18:39 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 18:37 dzahn@dns1005: END - running authdns-update
* 18:35 dzahn@dns1005: START - running authdns-update
* 18:33 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): blockers resolved, rolling to group1
* 18:31 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 18:29 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS bookworm
* 18:02 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 18:01 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:59 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 17:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 17:55 cmooney@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 17:36 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 17:35 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 17:34 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 17:33 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 17:32 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 17:31 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 17:28 topranks: rebooting asw1-23-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 17:27 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-23-ulsfo,asw1-23-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 17:20 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 17:18 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:17 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 17:16 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 17:15 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 17:14 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 17:08 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 17:07 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 17:06 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:02 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 39 hosts with reason: ulsfo depooled for switch work
* 16:53 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-22-ulsfo,asw1-22-ulsfo IPv6 with reason: upgrading sr-linux on asw1-23-ulsfo
* 16:52 topranks: rebooting asw1-22-ulsfo to upgrade SR-Linux OS on switch [[phab:T408892|T408892]]
* 16:45 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS trixie
* 16:39 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:37 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bookworm
* 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bookworm
* 16:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:28 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:09 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 16:04 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:58 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 15:57 sukhe@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dns4004.wikimedia.org with reason: host reimage
* 15:38 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bookworm
* 15:35 sukhe@cumin1003: START - Cookbook sre.hosts.reimage for host dns4004.wikimedia.org with OS bookworm
* 15:30 jasmine@cumin2002: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:08 sukhe: sudo cumin -b1 -s5 "C:bird and not dns4004*" "run-puppet-agent --enable 'merging CR 1282958'"
* 15:08 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-eqiad cluster: Change Confluent distribution.
* 15:06 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] (duration: 06m 41s)
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 15:02 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:01 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:59 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283805{{!}}Close Chinese Wikinews (T421796)]]
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5002.eqsin.wmnet
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 14:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:45 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:36 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:35 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:34 sukhe@puppetserver1001: conftool action : set/pooled=yes; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:31 sukhe@puppetserver1001: conftool action : set/pooled=no; selector: name=dns7001.wikimedia.org [reason: testing bird change]
* 14:30 kharlan@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] (duration: 11m 16s)
* 14:28 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:26 kharlan@deploy1003: kharlan: Continuing with deployment
* 14:25 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1282958'"
* 14:23 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 14:22 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:21 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:21 kharlan@deploy1003: kharlan: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5002.eqsin.wmnet
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 14:20 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 14:20 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:19 kharlan@deploy1003: Started scap sync-world: Backport for [[gerrit:1283050{{!}}Add user_groups to editAttemptStep schema (T424010)]]
* 14:19 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:18 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum5001.eqsin.wmnet
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:15 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] (duration: 06m 40s)
* 14:13 dmartin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 dmartin@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 dmartin@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:11 dmartin@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 14:10 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:10 dmartin@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 dmartin@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:09 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:08 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:08 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283783{{!}}Close German Wikinews (T421796)]]
* 14:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] (duration: 11m 28s)
* 14:02 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum5001.eqsin.wmnet
* 14:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 13:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:55 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 13:53 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: restart to test fixes from [[phab:T425301|T425301]] - bking@cumin2002
* 13:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1194: after reimage to trixie
* 13:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283751{{!}}Close French Wikinews (T421796)]]
* 13:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:45 jgreen@dns1004: END - running authdns-update
* 13:44 alexsanford@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] (duration: 30m 53s)
* 13:44 jgreen@dns1004: START - running authdns-update
* 13:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 13:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 13:35 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.ulsfo.wmnet on all recursors
* 13:34 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.ulsfo.wmnet on all recursors
* 13:32 alexsanford@deploy1003: alexsanford: Continuing with deployment
* 13:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:31 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4010.ulsfo.wmnet with reason: host reimage
* 13:31 alexsanford@deploy1003: alexsanford: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:28 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4008.ulsfo.wmnet']
* 13:27 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:26 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:24 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:19 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 13:19 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ganeti4008.mgmt.ulsfo.wmnet on all recursors
* 13:19 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:18 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:18 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entry for ganeti4008 mgmt - cmooney@cumin1003"
* 13:15 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:14 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 13:13 alexsanford@deploy1003: Started scap sync-world: Backport for [[gerrit:1283028{{!}}Add messages related to mandatory 2FA for more groups (T423119)]]
* 13:12 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4010.ulsfo.wmnet with OS trixie
* 13:11 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:08 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
* 13:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1194: after reimage to trixie
* 13:05 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 13:01 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1194.eqiad.wmnet with OS trixie
* 12:49 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS trixie
* 12:45 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2012.codfw.wmnet with OS trixie
* 12:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: update
* 12:35 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1194.eqiad.wmnet with reason: host reimage
* 12:24 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:21 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2012.codfw.wmnet with reason: host reimage
* 12:20 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1194.eqiad.wmnet with OS trixie
* 12:20 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 12:16 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 12:16 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 12:15 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 12:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb2011.codfw.wmnet with OS trixie
* 12:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] (duration: 06m 28s)
* 12:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:05 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2012.codfw.wmnet with OS trixie
* 12:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283735{{!}}Close Polish Wikinews (T421796)]]
* 12:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:57 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4006.ulsfo.wmnet
* 11:53 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:50 moritzm: installing openjdk-17 security updates
* 11:50 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92374 and previous config saved to /var/cache/conftool/dbconfig/20260506-114919-fceratto.json
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4006.ulsfo.wmnet
* 11:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1194: Reimage to Trixie
* 11:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2160.codfw.wmnet with reason: Reboot
* 11:44 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1194: Reimage to Trixie
* 11:44 jiji@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb2011.codfw.wmnet with reason: host reimage
* 11:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Reimage to Trixie
* 11:42 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:41 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92372 and previous config saved to /var/cache/conftool/dbconfig/20260506-113910-fceratto.json
* 11:30 jiji@cumin1003: START - Cookbook sre.hosts.reimage for host rdb2011.codfw.wmnet with OS trixie
* 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048', diff saved to https://phabricator.wikimedia.org/P92371 and previous config saved to /var/cache/conftool/dbconfig/20260506-112903-fceratto.json
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jmm@cumin2002"
* 11:20 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 11:19 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 11:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92370 and previous config saved to /var/cache/conftool/dbconfig/20260506-111854-fceratto.json
* 11:14 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 11:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 11:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1217.eqiad.wmnet with reason: Reboot
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 10:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:44 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 10:39 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 10:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 10:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:23 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:22 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti4006.ulsfo.wmnet']
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1048 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92369 and previous config saved to /var/cache/conftool/dbconfig/20260506-101836-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1048.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92368 and previous config saved to /var/cache/conftool/dbconfig/20260506-101808-fceratto.json
* 10:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 10:16 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 10:10 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92367 and previous config saved to /var/cache/conftool/dbconfig/20260506-100800-fceratto.json
* 09:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040', diff saved to https://phabricator.wikimedia.org/P92366 and previous config saved to /var/cache/conftool/dbconfig/20260506-095752-fceratto.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4008.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92365 and previous config saved to /var/cache/conftool/dbconfig/20260506-094744-fceratto.json
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:40 slyngshede@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 09:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:31 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:27 jmm@cumin2002: START - Cookbook sre.hosts.provision for host ganeti4006.mgmt.ulsfo.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1040 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92364 and previous config saved to /var/cache/conftool/dbconfig/20260506-092414-fceratto.json
* 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1040.eqiad.wmnet with reason: Maintenance
* 09:23 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006/8 mgmt - ayounsi@cumin1003"
* 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92363 and previous config saved to /var/cache/conftool/dbconfig/20260506-092345-fceratto.json
* 09:17 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:17 slyngshede@cumin1003: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 09:16 ayounsi@cumin1003: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 09:15 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2005.codfw.wmnet with reason: update
* 09:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms2 [[phab:T418979|T418979]]', diff saved to https://phabricator.wikimedia.org/P92362 and previous config saved to /var/cache/conftool/dbconfig/20260506-091513-marostegui.json
* 09:14 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:14 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 09:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2253: Replacing HW [[phab:T418979|T418979]]
* 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92361 and previous config saved to /var/cache/conftool/dbconfig/20260506-091337-fceratto.json
* 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039', diff saved to https://phabricator.wikimedia.org/P92360 and previous config saved to /var/cache/conftool/dbconfig/20260506-090329-fceratto.json
* 09:03 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] (duration: 08m 44s)
* 08:59 zabe@deploy1003: zabe: Continuing with deployment
* 08:56 zabe@deploy1003: zabe: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:54 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281894{{!}}Correctly support new file tables in RevisionDeleteUser (T424553)]]
* 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92359 and previous config saved to /var/cache/conftool/dbconfig/20260506-085321-fceratto.json
* 08:43 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2253 to ms2 [[phab:T418973|T418973]]', diff saved to https://phabricator.wikimedia.org/P92358 and previous config saved to /var/cache/conftool/dbconfig/20260506-084337-marostegui.json
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1039 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92357 and previous config saved to /var/cache/conftool/dbconfig/20260506-083841-fceratto.json
* 08:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1039.eqiad.wmnet with reason: Maintenance
* 08:29 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:09 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 08:08 marostegui@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db2208.codfw.wmnet with OS trixie
* 08:06 awight: EU morning deployment is done
* 08:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Replacing hw
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:59 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
* 07:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Replacing HW [[phab:T418979|T418979]]
* 07:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS trixie
* 07:40 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] (duration: 08m 58s)
* 07:36 awight@deploy1003: wmde-fisch, awight, dcausse: Continuing with deployment
* 07:33 awight@deploy1003: wmde-fisch, awight, dcausse: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:31 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283101{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]], [[gerrit:1283037{{!}}search: fix alt. completion indices to test keyword tokenizer (T420427)]], [[gerrit:1283041{{!}}search: enable Latin-to-Devanagari transliteration second-chance (T425018)]]
* 07:26 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] (duration: 07m 37s)
* 07:22 awight@deploy1003: awight, lilients: Continuing with deployment
* 07:21 awight@deploy1003: awight, lilients: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:19 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1283033{{!}}VE: Avoid counting all refs when listIndex is undefined (T425433)]]
* 07:14 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4008.ulsfo.wmnet
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4008.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:54 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1191: after reimage to trixie
* 06:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1189: after reimage to trixie
* 06:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4008.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ganeti4006.ulsfo.wmnet
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti4006.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 06:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:20 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti4006.ulsfo.wmnet
* 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2208.codfw.wmnet with reason: Idrac issues [[phab:T425506|T425506]]
* 05:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1191.eqiad.wmnet with reason: host reimage
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1189.eqiad.wmnet with reason: host reimage
* 05:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:25 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2208: Reimage to Trixie
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Reimage to Trixie
* 05:23 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1191.eqiad.wmnet with OS trixie
* 05:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1191: Reimage to Trixie
* 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Reimage to Trixie
* 05:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1189.eqiad.wmnet with OS trixie
* 05:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1189: Reimage to Trixie
* 05:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1189.eqiad.wmnet with reason: Reimage to Trixie
* 05:11 marostegui@dns1004: END - running authdns-update
* 05:09 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db1189 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92345 and previous config saved to /var/cache/conftool/dbconfig/20260506-050948-marostegui.json
* 05:09 marostegui@dns1004: START - running authdns-update
* 05:08 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db1223 to s3 primary and set section read-write [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92344 and previous config saved to /var/cache/conftool/dbconfig/20260506-050816-marostegui.json
* 05:07 marostegui@cumin1003: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92343 and previous config saved to /var/cache/conftool/dbconfig/20260506-050755-marostegui.json
* 05:06 marostegui: Starting s3 eqiad failover from db1189 to db1223 - [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T425318|T425318]]
* 05:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db1223 with weight 0 [[phab:T425318|T425318]]', diff saved to https://phabricator.wikimedia.org/P92342 and previous config saved to /var/cache/conftool/dbconfig/20260506-050342-marostegui.json
* 03:28 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 03:27 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 37s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:05 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] (duration: 06m 26s)
* 00:49 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283061{{!}}Close Dutch Wikinews (T421796)]]
* 00:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:41 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
* 00:27 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] (duration: 07m 26s)
* 00:25 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1001
* 00:25 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1001
* 00:24 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS trixie
* 00:23 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:21 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:20 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283117{{!}}Close Italian Wikinews (T421796)]]
== 2026-05-05 ==
* 23:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:30 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update ip addresses for nodes in rack 23 - pt1979@cumin2002"
* 23:26 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] (duration: 06m 58s)
* 22:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283104{{!}}Close Arabic Wikinews (T421796)]]
* 22:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] (duration: 06m 28s)
* 22:39 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:37 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283103{{!}}Close Ukrainian Wikinews (T421796)]]
* 22:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] (duration: 07m 56s)
* 22:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:20 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:18 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283100{{!}}Close Romanian Wikinews (T421796)]]
* 22:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] (duration: 06m 45s)
* 22:12 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 22:11 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283099{{!}}Close Serbian Wikinews (T421796)]]
* 22:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] (duration: 11m 07s)
* 21:59 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 21:58 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283066{{!}}Close Persian Wikinews (T421796)]]
* 21:49 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] (duration: 32m 55s)
* 21:36 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Continuing with deployment
* 21:33 arlolra@deploy1003: jdlrobson, mmartorana, arlolra: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:16 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1281501{{!}}Email confirmation banner: Remove obsolete arm_b variant (T421366)]], [[gerrit:1283056{{!}}Legacy parser no longer varies by user thumbnail size. (T417513)]]
* 20:59 dancy@deploy1003: Installation of scap version "4.262.1" completed for 2 hosts
* 20:57 dancy@deploy1003: Installing scap version "4.262.1" for 2 host(s)
* 20:57 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] (duration: 10m 59s)
* 20:52 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Continuing with deployment
* 20:48 arlolra@deploy1003: mpostoronca, h2o, awight, arlolra: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be ve
* 20:46 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1282930{{!}}hCaptcha: Add diagnostic context to script load error logs (T424496)]], [[gerrit:1282397{{!}}sectionCollapsing: Scroll to fragment target on init (T425290)]], [[gerrit:1282804{{!}}Errors added below ref list dirty when not responsive (T384599)]]
* 20:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 20:22 arlolra@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] (duration: 10m 30s)
* 20:20 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS trixie
* 20:18 arlolra@deploy1003: aaron, neriah, arlolra: Continuing with deployment
* 20:14 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:13 arlolra@deploy1003: aaron, neriah, arlolra: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 arlolra@deploy1003: Started scap sync-world: Backport for [[gerrit:1283082{{!}}Enable WikiLove on shwiki (T424891)]], [[gerrit:1276814{{!}}Add wikibase.v1 module to the sandbox were it is present (T422403)]]
* 20:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 20:09 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 20:07 pt1979@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 20:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:57 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
* 19:55 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 19:55 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:54 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:45 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1002
* 19:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1002
* 19:41 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:41 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:40 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:39 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1002
* 19:39 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1002.eqiad.wmnet 142.32.64.10.in-addr.arpa 2.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:39 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:38 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1002 - herron@cumin1003"
* 19:32 herron@cumin1003: START - Cookbook sre.dns.netbox
* 19:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 19:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:30 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1002
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS trixie
* 19:17 dancy@deploy1003: Installation of scap version "4.262.0" completed for 2 hosts
* 19:15 dancy@deploy1003: Installing scap version "4.262.0" for 2 host(s)
* 19:15 cmooney@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: rebooting firewall in desperation
* 19:14 brennen@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 19:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:05 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "set correct vlan group in netbox for new ulsfo vlans - cmooney@cumin1003 - [[phab:T408892|T408892]]"
* 19:04 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 19:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] (duration: 10m 59s)
* 18:56 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:52 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283063{{!}}Close Swedish Wikinews (T421796)]]
* 18:49 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:47 brennen@deploy1003: Finished scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]] (duration: 36m 04s)
* 18:44 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update cp4038 ip address - pt1979@cumin2002"
* 18:40 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:30 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:25 pt1979@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4038.ulsfo.wmnet with OS trixie
* 18:14 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-codfw
* 18:13 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-codfw
* 18:13 pt1979@cumin1003: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 18:11 brennen@deploy1003: Started scap sync-world: testwikis to 1.47.0-wmf.1 refs [[phab:T423910|T423910]]
* 18:10 cmooney@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device pfw1a-eqiad
* 18:10 cmooney@cumin1003: START - Cookbook sre.network.tls for network device pfw1a-eqiad
* 18:06 brennen: 1.47.0-wmf.1 train status ([[phab:T423910|T423910]]): no current blockers, rolling to group0
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:38 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1003.eqiad.wmnet with reason: host reimage
* 17:33 herron@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 17:32 herron@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 17:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1003
* 17:21 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1003
* 17:20 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:19 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:16 herron@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging1003.eqiad.wmnet 66.48.64.10.in-addr.arpa 6.6.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:15 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging1003 - herron@cumin1003"
* 17:12 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:09 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1003
* 17:08 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1003.eqiad.wmnet with OS trixie
* 17:05 sukhe: sudo cumin -b11 "A:cp and not P<nowiki>{</nowiki>cp2041* or cp2042*<nowiki>}</nowiki> and not A:ulsfo" "run-puppet-agent --enable 'merging CR 1282979'"
* 16:58 sbassett@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] (duration: 07m 25s)
* 16:53 sbassett@deploy1003: mstyles, sbassett: Continuing with deployment
* 16:52 sbassett@deploy1003: mstyles, sbassett: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdeb
* 16:50 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]], [[gerrit:1283049{{!}}Remove undefined variable $wmgUseCSPReportOnlyHasSession (T419612 T420604 T420607)]]
* 16:38 sbassett@deploy1003: Started scap sync-world: Backport for [[gerrit:1283036{{!}}Set $wgReauthenticateTime editsitejs to one hour (T197137)]], [[gerrit:1283020{{!}}Set CSP to enforce with allow-listed domains in Wikimedia production (T419612 T420604 T420607)]]
* 16:19 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 16:19 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 16:18 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 16:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] (duration: 06m 16s)
* 16:07 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:07 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:05 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283040{{!}}Close Japanese Wikinews (T421796)]]
* 16:01 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] (duration: 07m 53s)
* 15:57 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:55 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 15:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:55 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 15:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 15:54 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 15:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283039{{!}}Close Korean Wikinews (T421796)]]
* 15:52 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] (duration: 06m 12s)
* 15:48 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:47 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283038{{!}}Close Finnish Wikinews (T421796)]]
* 15:42 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 15:42 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 15:39 dzahn@dns1005: END - running authdns-update
* 15:38 mutante: deleting mwmaint.discovery.wmnet DNS entry - the hosts behind it don't exist anymore
* 15:37 dzahn@dns1005: START - running authdns-update
* 15:24 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:24 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:21 dcausse@deploy1003: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 dcausse@deploy1003: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 15:20 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] (duration: 06m 17s)
* 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92340 and previous config saved to /var/cache/conftool/dbconfig/20260505-151930-fceratto.json
* 15:16 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:16 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283024{{!}}Close Czech Wikinews (T421796)]]
* 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92339 and previous config saved to /var/cache/conftool/dbconfig/20260505-150921-fceratto.json
* 15:08 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] (duration: 07m 06s)
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:04 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:03 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:01 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1283003{{!}}Close Tamil Wikinews (T421796)]]
* 14:59 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] (duration: 07m 48s)
* 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P92338 and previous config saved to /var/cache/conftool/dbconfig/20260505-145913-fceratto.json
* 14:58 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:57 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:55 urbanecm@deploy1003: urbanecm: Continuing with deployment
* 14:53 urbanecm@deploy1003: urbanecm: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92337 and previous config saved to /var/cache/conftool/dbconfig/20260505-145231-fceratto.json
* 14:51 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1283002{{!}}fix: wrong property name action_data (T425425)]]
* 14:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92336 and previous config saved to /var/cache/conftool/dbconfig/20260505-144905-fceratto.json
* 14:44 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92335 and previous config saved to /var/cache/conftool/dbconfig/20260505-144223-fceratto.json
* 14:42 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 14:41 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 14:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92334 and previous config saved to /var/cache/conftool/dbconfig/20260505-144029-fceratto.json
* 14:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 14:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92333 and previous config saved to /var/cache/conftool/dbconfig/20260505-143958-fceratto.json
* 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P92332 and previous config saved to /var/cache/conftool/dbconfig/20260505-143214-fceratto.json
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=eqiad
* 14:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92331 and previous config saved to /var/cache/conftool/dbconfig/20260505-142949-fceratto.json
* 14:28 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master1001.eqiad.wmnet
* 14:25 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1004.eqiad.wmnet with reason: host reimage
* 14:24 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master1001.eqiad.wmnet
* 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92329 and previous config saved to /var/cache/conftool/dbconfig/20260505-142206-fceratto.json
* 14:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P92328 and previous config saved to /var/cache/conftool/dbconfig/20260505-141941-fceratto.json
* 14:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 14:11 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1004
* 14:10 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS trixie
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1015.eqiad.wmnet
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1015.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92327 and previous config saved to /var/cache/conftool/dbconfig/20260505-140933-fceratto.json
* 14:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 14:07 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:07 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:06 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:05 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=eqiad
* 14:05 jmm@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=config-master,name=codfw
* 14:04 elukey@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 14:04 elukey@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 14:03 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 14:03 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 14:03 Lucas_WMDE: UTC afternoon backport+config window done
* 14:03 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM config-master2001.codfw.wmnet
* 14:02 jasmine@cumin2002: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 14:01 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1015.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1014.eqiad.wmnet
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:01 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:01 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92326 and previous config saved to /var/cache/conftool/dbconfig/20260505-140047-fceratto.json
* 14:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92325 and previous config saved to /var/cache/conftool/dbconfig/20260505-140016-fceratto.json
* 13:59 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Repooling
* 13:59 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM config-master2001.codfw.wmnet
* 13:58 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:58 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:55 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] (duration: 06m 22s)
* 13:50 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1014.eqiad.wmnet
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92323 and previous config saved to /var/cache/conftool/dbconfig/20260505-135008-fceratto.json
* 13:50 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 13:49 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 jmm@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=config-master,name=codfw
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1011.eqiad.wmnet
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:47 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:47 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282988{{!}}Close Portuguese Wikinews (T421796)]]
* 13:47 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1011.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2209 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92321 and previous config saved to /var/cache/conftool/dbconfig/20260505-134522-fceratto.json
* 13:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2209.codfw.wmnet with reason: Maintenance
* 13:44 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Repooling
* 13:44 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:43 jasmine@cumin2002: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-main-codfw cluster: Change Confluent distribution.
* 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92319 and previous config saved to /var/cache/conftool/dbconfig/20260505-134257-fceratto.json
* 13:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P92318 and previous config saved to /var/cache/conftool/dbconfig/20260505-134000-fceratto.json
* 13:37 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1011.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aqs1010.eqiad.wmnet
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:37 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:37 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aqs1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1003"
* 13:33 eevans@cumin1003: START - Cookbook sre.dns.netbox
* 13:30 Msz2001: UTC afternoon backport window done
* 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92317 and previous config saved to /var/cache/conftool/dbconfig/20260505-132952-fceratto.json
* 13:27 eevans@cumin1003: START - Cookbook sre.hosts.decommission for hosts aqs1010.eqiad.wmnet
* 13:24 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 13:23 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 13:23 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] (duration: 08m 37s)
* 13:23 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 13:22 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 13:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on dborch1002.wikimedia.org with reason: [[phab:T416582|T416582]]
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92316 and previous config saved to /var/cache/conftool/dbconfig/20260505-132002-fceratto.json
* 13:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 13:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92315 and previous config saved to /var/cache/conftool/dbconfig/20260505-131931-fceratto.json
* 13:19 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Continuing with deployment
* 13:16 mszwarc@deploy1003: mszwarc, jhsoby, matmarex, d3r1ck01: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:15 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1270882{{!}}Remove temporary `wgOAuth2UsePrefixedSub` feature flag (T417690)]], [[gerrit:1271969{{!}}Move privileged global and local group handling to WikimediaCustomizations (T418507)]], [[gerrit:1281964{{!}}Add Akan (ak) to wmgExtraLanguageNames by default (T333765 T425256)]]
* 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 13:11 mszwarc@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] (duration: 07m 55s)
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:11 atsuko@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92314 and previous config saved to /var/cache/conftool/dbconfig/20260505-130923-fceratto.json
* 13:07 mszwarc@deploy1003: mszwarc: Continuing with deployment
* 13:05 mszwarc@deploy1003: mszwarc: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy1003: Started scap sync-world: Backport for [[gerrit:1282850{{!}}Switch 'autoconfirmed' to use APCOND_AGE_FROM_EDIT on certain wikis (T418484)]]
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P92313 and previous config saved to /var/cache/conftool/dbconfig/20260505-125915-fceratto.json
* 12:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] (duration: 07m 23s)
* 12:52 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 12:50 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:49 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282970{{!}}Close Esperanto Wikinews (T421796)]]
* 12:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92312 and previous config saved to /var/cache/conftool/dbconfig/20260505-124907-fceratto.json
* 12:44 sgimeno@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] (duration: 03m 56s)
* 12:43 sgimeno@deploy1003: sgimeno: Continuing with deployment
* 12:42 moritzm: installing node-tar security updates
* 12:41 sgimeno@deploy1003: sgimeno: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:40 sgimeno@deploy1003: Started scap sync-world: Backport for [[gerrit:1280226{{!}}loggedOutWarning: instrument browser navigation and tab close (T421518)]]
* 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92311 and previous config saved to /var/cache/conftool/dbconfig/20260505-124041-fceratto.json
* 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 12:36 moritzm: installing imagemagick security updates
* 12:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92310 and previous config saved to /var/cache/conftool/dbconfig/20260505-123411-fceratto.json
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:33 atsuko@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ttmserver-test: apply
* 12:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 12:29 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 12:28 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 12:26 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92309 and previous config saved to /var/cache/conftool/dbconfig/20260505-122404-fceratto.json
* 12:23 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 12:23 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
* 12:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P92308 and previous config saved to /var/cache/conftool/dbconfig/20260505-121352-fceratto.json
* 12:04 moritzm: installing postgresql-13 security updates
* 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92307 and previous config saved to /var/cache/conftool/dbconfig/20260505-120344-fceratto.json
* 11:57 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] (duration: 06m 13s)
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2005.codfw.wmnet
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92306 and previous config saved to /var/cache/conftool/dbconfig/20260505-115535-fceratto.json
* 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92305 and previous config saved to /var/cache/conftool/dbconfig/20260505-115503-fceratto.json
* 11:53 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:53 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:52 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2005.codfw.wmnet
* 11:51 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282960{{!}}Close Shan Wikinews (T421796)]]
* 11:47 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] (duration: 09m 21s)
* 11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2004.codfw.wmnet
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92304 and previous config saved to /var/cache/conftool/dbconfig/20260505-114455-fceratto.json
* 11:43 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 11:43 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2004.codfw.wmnet
* 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd2003.codfw.wmnet
* 11:39 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:38 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd2003.codfw.wmnet
* 11:38 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282955{{!}}Close Norwegian Wikinews (T421796)]]
* 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P92303 and previous config saved to /var/cache/conftool/dbconfig/20260505-113446-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92302 and previous config saved to /var/cache/conftool/dbconfig/20260505-112449-fceratto.json
* 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92301 and previous config saved to /var/cache/conftool/dbconfig/20260505-112438-fceratto.json
* 11:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92300 and previous config saved to /var/cache/conftool/dbconfig/20260505-111616-fceratto.json
* 11:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92299 and previous config saved to /var/cache/conftool/dbconfig/20260505-111545-fceratto.json
* 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92298 and previous config saved to /var/cache/conftool/dbconfig/20260505-111435-fceratto.json
* 11:10 moritzm: installing ca-certificates updates from bookworm point release
* 11:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2221: after reimage to trixie
* 11:07 moritzm: installing multipart bugfix updates from bookworm point release
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92296 and previous config saved to /var/cache/conftool/dbconfig/20260505-110537-fceratto.json
* 11:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:05 ayounsi@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading P<nowiki>{</nowiki>lvs4009*<nowiki>}</nowiki> and A:liberica
* 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P92295 and previous config saved to /var/cache/conftool/dbconfig/20260505-110427-fceratto.json
* 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1174: after reimage to trixie
* 10:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P92293 and previous config saved to /var/cache/conftool/dbconfig/20260505-105529-fceratto.json
* 10:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92291 and previous config saved to /var/cache/conftool/dbconfig/20260505-105419-fceratto.json
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 10:50 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:49 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92290 and previous config saved to /var/cache/conftool/dbconfig/20260505-104521-fceratto.json
* 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92288 and previous config saved to /var/cache/conftool/dbconfig/20260505-104032-fceratto.json
* 10:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92286 and previous config saved to /var/cache/conftool/dbconfig/20260505-103702-fceratto.json
* 10:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 10:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92285 and previous config saved to /var/cache/conftool/dbconfig/20260505-103632-fceratto.json
* 10:32 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:29 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 10:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92283 and previous config saved to /var/cache/conftool/dbconfig/20260505-102623-fceratto.json
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2221: after reimage to trixie
* 10:24 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:23 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:23 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:22 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2221.codfw.wmnet with OS trixie
* 10:17 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P92281 and previous config saved to /var/cache/conftool/dbconfig/20260505-101616-fceratto.json
* 10:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1174: after reimage to trixie
* 09:42 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:41 jelto@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 09:39 jelto@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 09:38 jelto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 09:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P92271 and previous config saved to /var/cache/conftool/dbconfig/20260505-093703-fceratto.json
* 09:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92270 and previous config saved to /var/cache/conftool/dbconfig/20260505-093619-fceratto.json
* 09:36 jelto@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 09:35 aikochou@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1173 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92269 and previous config saved to /var/cache/conftool/dbconfig/20260505-093305-fceratto.json
* 09:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
* 09:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS trixie
* 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2221.codfw.wmnet with OS trixie
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 09:30 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1174: Reimage to Trixie
* 09:29 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2221: Reimage to Trixie
* 09:29 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1174: Reimage to Trixie
* 09:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2221: Reimage to Trixie
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Reimage to Trixie
* 09:28 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Reimage to Trixie
* 09:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92265 and previous config saved to /var/cache/conftool/dbconfig/20260505-092654-fceratto.json
* 09:26 jelto@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 09:25 jelto@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92264 and previous config saved to /var/cache/conftool/dbconfig/20260505-092431-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92263 and previous config saved to /var/cache/conftool/dbconfig/20260505-091808-fceratto.json
* 09:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92262 and previous config saved to /var/cache/conftool/dbconfig/20260505-091423-fceratto.json
* 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92260 and previous config saved to /var/cache/conftool/dbconfig/20260505-091254-fceratto.json
* 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213', diff saved to https://phabricator.wikimedia.org/P92259 and previous config saved to /var/cache/conftool/dbconfig/20260505-090415-fceratto.json
* 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92258 and previous config saved to /var/cache/conftool/dbconfig/20260505-090246-fceratto.json
* 08:58 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2209: after reimage to trixie
* 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92256 and previous config saved to /var/cache/conftool/dbconfig/20260505-085407-fceratto.json
* 08:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS trixie
* 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P92255 and previous config saved to /var/cache/conftool/dbconfig/20260505-085238-fceratto.json
* 08:50 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 08:50 moritzm: installing augeas security updates
* 08:49 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:48 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2213 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92254 and previous config saved to /var/cache/conftool/dbconfig/20260505-084616-fceratto.json
* 08:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 08:42 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92253 and previous config saved to /var/cache/conftool/dbconfig/20260505-084231-fceratto.json
* 08:41 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 08:40 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 08:38 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:37 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 08:37 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 08:35 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 08:34 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 08:34 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 13 hosts with reason: switches replacement
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92252 and previous config saved to /var/cache/conftool/dbconfig/20260505-083356-fceratto.json
* 08:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92251 and previous config saved to /var/cache/conftool/dbconfig/20260505-083326-fceratto.json
* 08:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 08:32 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 08:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:28 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 08:24 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92250 and previous config saved to /var/cache/conftool/dbconfig/20260505-082318-fceratto.json
* 08:22 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2222: after reimage to trixie
* 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
* 08:16 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/x-flac # [[phab:T414641|T414641]]
* 08:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: after reimage to trixie
* 08:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P92247 and previous config saved to /var/cache/conftool/dbconfig/20260505-081309-fceratto.json
* 08:08 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --broken-only --mediatype AUDIO --mime audio/flac # [[phab:T414641|T414641]]
* 08:05 ayounsi@dns1004: END - running authdns-update
* 08:03 ayounsi@dns1004: START - running authdns-update
* 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92245 and previous config saved to /var/cache/conftool/dbconfig/20260505-080301-fceratto.json
* 08:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS trixie
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:01 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ulsfo includes - ayounsi@cumin1003"
* 08:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2209: Reimage to Trixie
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2209.codfw.wmnet with reason: Reimage to Trixie
* 07:58 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 07:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2209 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92243 and previous config saved to /var/cache/conftool/dbconfig/20260505-075746-marostegui.json
* 07:56 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2205 to s3 primary [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92242 and previous config saved to /var/cache/conftool/dbconfig/20260505-075654-marostegui.json
* 07:55 awight: EU morning deployment was fun
* 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92241 and previous config saved to /var/cache/conftool/dbconfig/20260505-075416-fceratto.json
* 07:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:52 marostegui: Starting s3 codfw failover from db2209 to db2205 - [[phab:T424864|T424864]]
* 07:51 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2205 with weight 0 [[phab:T424864|T424864]]', diff saved to https://phabricator.wikimedia.org/P92239 and previous config saved to /var/cache/conftool/dbconfig/20260505-075156-marostegui.json
* 07:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s3 [[phab:T424864|T424864]]
* 07:50 zabe: zabe@deploy1003:~$ foreachwiki refreshImageMetadata --force --mediatype AUDIO --mime audio/midi # [[phab:T414645|T414645]]
* 07:45 zabe: zabe@deploy1003:~$ mwscript namespaceDupes.php scnwiki --fix # [[phab:T425378|T425378]]
* 07:36 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2222: after reimage to trixie
* 07:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2222.codfw.wmnet with OS trixie
* 07:30 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1170: after reimage to trixie
* 07:26 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS trixie
* 07:11 awight@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] (duration: 06m 43s)
* 07:07 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:07 awight@deploy1003: awight, 1f616emo: Continuing with deployment
* 07:06 awight@deploy1003: awight, 1f616emo: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:05 awight@deploy1003: Started scap sync-world: Backport for [[gerrit:1281967{{!}}zhwikinews: (2/2) revert 20th anniversary logo change (assets) (T420165)]]
* 07:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 07:03 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 07:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2222.codfw.wmnet with reason: host reimage
* 07:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1156: after reimage to trixie
* 06:58 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 06:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 06:58 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS trixie
* 06:44 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2222.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Reimage to Trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Reimage to Trixie
* 06:42 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2222: Reimage to Trixie
* 06:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Reimage to Trixie
* 06:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1156: after reimage to trixie
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS trixie
* 05:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:46 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:43 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: patterns_as_inline_patterns - oblivian@cumin1003
* 05:42 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "patterns_as_inline_patterns - oblivian@cumin1003"
* 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS trixie
* 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1156: Reimage to Trixie
* 05:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1156.eqiad.wmnet with reason: Reimage to Trixie
* 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s2 master: reimage to Debian Trixie
* 04:03 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.23 (duration: 03m 12s)
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 39s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 01:21 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:21 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns infor for new switches - pt1979@cumin2002"
* 01:16 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] (duration: 06m 50s)
* 00:11 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 00:10 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282434{{!}}Close Catalan Wikinews (T421796)]]
== 2026-05-04 ==
* 23:48 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:46 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282432{{!}}Close Bosnian Wikinews (T421796)]]
* 23:14 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] (duration: 06m 45s)
* 23:10 ladsgroup@deploy1003: neriah, ladsgroup: Continuing with deployment
* 23:09 ladsgroup@deploy1003: neriah, ladsgroup: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:07 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282410{{!}}Close Hebrew Wikinews (T421796)]]
* 22:08 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 22:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 21:43 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:32 cwhite@deploy1003: Finished deploy [statsv/statsv@152de49]: fix logging (duration: 00m 11s)
* 21:32 cwhite@deploy1003: Started deploy [statsv/statsv@152de49]: fix logging
* 21:20 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] (duration: 11m 20s)
* 21:16 cjming@deploy1003: cjming, neriah: Continuing with deployment
* 21:10 cjming@deploy1003: cjming, neriah: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1276432{{!}}Enable Hebrew keyboard DWIM for namespace resolution on hewikis (T412468)]]
* 20:38 cjming@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] (duration: 22m 19s)
* 20:34 cjming@deploy1003: mmartorana, cjming: Continuing with deployment
* 20:18 cjming@deploy1003: mmartorana, cjming: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:16 cjming@deploy1003: Started scap sync-world: Backport for [[gerrit:1282385{{!}}Revert^2 "Use js promise for email confirmation banner"]]
* 20:11 toyofuku@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] (duration: 07m 21s)
* 20:07 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1005.eqiad.wmnet with OS trixie
* 20:06 toyofuku@deploy1003: toyofuku: Continuing with deployment
* 20:05 toyofuku@deploy1003: toyofuku: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 toyofuku@deploy1003: Started scap sync-world: Backport for [[gerrit:1277667{{!}}Enable the reading list beta feature survey on all wikipedias (T421776)]]
* 19:51 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) asw1-22-ulsfo.wikimedia.org on all recursors
* 19:50 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache asw1-22-ulsfo.wikimedia.org on all recursors
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: asw1-22-ulsfo - ayounsi@cumin1003"
* 19:48 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:44 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 19:42 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1005.eqiad.wmnet with reason: host reimage
* 19:40 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:37 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:28 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: ongoing troubleshooting
* 19:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging1005
* 19:27 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging1005.eqiad.wmnet with OS trixie
* 19:23 root@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:23 root@deploy1003: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 19:23 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: remove privatemounts to see if it helps - bking@cumin2002 - [[phab:T424852|T424852]]
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 19:06 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 18:59 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] (duration: 06m 16s)
* 18:55 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:55 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:53 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282407{{!}}Close Limburgish Wikinews (T421796)]]
* 18:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] (duration: 09m 17s)
* 18:27 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 18:23 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:22 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282405{{!}}Close Albanian Wikinews (T421796)]]
* 18:11 dancy@deploy1003: Finished scap sync-world: testing (duration: 02m 04s)
* 18:11 dancy@deploy1003: dancy: Rolling back deployment
* 18:10 dancy@deploy1003: dancy: testing synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:09 dancy@deploy1003: Started scap sync-world: testing
* 18:08 dancy@deploy1003: Installation of scap version "4.260.0" completed for 2 hosts
* 18:06 dancy@deploy1003: Installing scap version "4.260.0" for 2 host(s)
* 17:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:41 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 17:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:40 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:39 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:34 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:33 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:33 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 16:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] (duration: 06m 19s)
* 16:00 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 16:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282384{{!}}Close Greek Wikinews (T421796)]]
* 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92224 and previous config saved to /var/cache/conftool/dbconfig/20260504-155514-fceratto.json
* 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92223 and previous config saved to /var/cache/conftool/dbconfig/20260504-154506-fceratto.json
* 15:38 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] (duration: 06m 59s)
* 15:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92222 and previous config saved to /var/cache/conftool/dbconfig/20260504-153458-fceratto.json
* 15:34 ladsgroup@deploy1003: ladsgroup, chlod: Continuing with deployment
* 15:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on 39 hosts with reason: switches replacement
* 15:33 ladsgroup@deploy1003: ladsgroup, chlod: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:32 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
* 15:32 elukey@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
* 15:31 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282060{{!}}Make errorpages responsive]]
* 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92221 and previous config saved to /var/cache/conftool/dbconfig/20260504-152449-fceratto.json
* 15:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92220 and previous config saved to /var/cache/conftool/dbconfig/20260504-152238-fceratto.json
* 15:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 15:20 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:17 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:17 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:13 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install5004.wikimedia.org
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92219 and previous config saved to /var/cache/conftool/dbconfig/20260504-151238-fceratto.json
* 15:10 papaul: ongoing switch refresh in ULSFO
* 15:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:10 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:06 elukey@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:05 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] (duration: 06m 45s)
* 15:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92218 and previous config saved to /var/cache/conftool/dbconfig/20260504-150230-fceratto.json
* 15:01 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 15:00 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:58 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1282381{{!}}Close Gun Wikinews (T421796)]]
* 14:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229', diff saved to https://phabricator.wikimedia.org/P92217 and previous config saved to /var/cache/conftool/dbconfig/20260504-145222-fceratto.json
* 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92216 and previous config saved to /var/cache/conftool/dbconfig/20260504-144213-fceratto.json
* 14:41 pt1979@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 7 hosts
* 14:41 pt1979@cumin1003: START - Cookbook sre.hosts.remove-downtime for 7 hosts
* 14:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:34 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
* 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2229 ([[phab:T419635|T419635]])', diff saved to https://phabricator.wikimedia.org/P92215 and previous config saved to /var/cache/conftool/dbconfig/20260504-143334-fceratto.json
* 14:33 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2229.codfw.wmnet with reason: Maintenance
* 14:30 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cr[3-4]-ulsfo IPv6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPv6 with reason: switch refresh
* 14:28 pt1979@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 4:00:00 on cr[3-4]-ulsfo IPV6,cr[3-4]-ulsfo.mgmt,mr1-ulsfo IPV6 with reason: switch refresh
* 14:25 pt1979@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on asw2-ulsfo,cr[3-4]-ulsfo,mr1-ulsfo with reason: switch refresh
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2001
* 14:16 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2001
* 14:13 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2001
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2001.codfw.wmnet 94.0.192.10.in-addr.arpa 4.9.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:13 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:13 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2001 - herron@cumin1003"
* 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92214 and previous config saved to /var/cache/conftool/dbconfig/20260504-141113-fceratto.json
* 14:07 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2001
* 14:04 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS trixie
* 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92213 and previous config saved to /var/cache/conftool/dbconfig/20260504-140105-fceratto.json
* 14:00 slyngshede@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=ulsfo [reason: ulsfo switch refresh [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 14:00 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: New switch configuration, [[phab:T408892|T408892]]]
* 13:59 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] (duration: 06m 22s)
* 13:57 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install5004.wikimedia.org on all recursors
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install5004.wikimedia.org - jmm@cumin2002"
* 13:55 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 13:55 sbisson@deploy1003: sbisson: Continuing with deployment
* 13:55 sbisson@deploy1003: sbisson: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:54 dcausse: [[phab:T425301|T425301]]: stopping writes again on cloudelastic, cluster unstable
* 13:53 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1282354{{!}}ArticleGuidance: enable on simple english (T425351)]]
* 13:52 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:52 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install5004.wikimedia.org
* 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P92212 and previous config saved to /var/cache/conftool/dbconfig/20260504-135056-fceratto.json
* 13:50 sbisson@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] (duration: 07m 30s)
* 13:46 sbisson@deploy1003: 1f616emo, sbisson: Continuing with deployment
* 13:45 sbisson@deploy1003: 1f616emo, sbisson: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 sbisson@deploy1003: Started scap sync-world: Backport for [[gerrit:1281965{{!}}zhwikinews: (1/2) revert 20th anniversary logo change (config) (T420165)]]
* 13:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92211 and previous config saved to /var/cache/conftool/dbconfig/20260504-134048-fceratto.json
* 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92210 and previous config saved to /var/cache/conftool/dbconfig/20260504-133039-fceratto.json
* 13:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 13:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92209 and previous config saved to /var/cache/conftool/dbconfig/20260504-133010-fceratto.json
* 13:29 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:23 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92208 and previous config saved to /var/cache/conftool/dbconfig/20260504-132002-fceratto.json
* 13:13 moritzm: installing jaraco.context security updates
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5004.eqsin.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5004.eqsin.wmnet with OS bookworm
* 13:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P92207 and previous config saved to /var/cache/conftool/dbconfig/20260504-130953-fceratto.json
* 12:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92206 and previous config saved to /var/cache/conftool/dbconfig/20260504-125945-fceratto.json
* 12:59 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 12:59 dcausse: [[phab:T425301|T425301]]: resuming writes on cloudelastic
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92205 and previous config saved to /var/cache/conftool/dbconfig/20260504-125247-fceratto.json
* 12:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92204 and previous config saved to /var/cache/conftool/dbconfig/20260504-125219-fceratto.json
* 12:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5004.eqsin.wmnet with reason: host reimage
* 12:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92203 and previous config saved to /var/cache/conftool/dbconfig/20260504-124210-fceratto.json
* 12:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P92202 and previous config saved to /var/cache/conftool/dbconfig/20260504-123203-fceratto.json
* 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92201 and previous config saved to /var/cache/conftool/dbconfig/20260504-122155-fceratto.json
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92200 and previous config saved to /var/cache/conftool/dbconfig/20260504-121441-fceratto.json
* 12:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92199 and previous config saved to /var/cache/conftool/dbconfig/20260504-121424-fceratto.json
* 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92198 and previous config saved to /var/cache/conftool/dbconfig/20260504-120416-fceratto.json
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5004.eqsin.wmnet with OS bookworm
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5004.eqsin.wmnet on all recursors
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P92197 and previous config saved to /var/cache/conftool/dbconfig/20260504-115408-fceratto.json
* 11:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5004.eqsin.wmnet - jmm@cumin2002"
* 11:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:47 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5004.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5003.eqsin.wmnet
* 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum5003.eqsin.wmnet with OS bookworm
* 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92196 and previous config saved to /var/cache/conftool/dbconfig/20260504-114400-fceratto.json
* 11:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92195 and previous config saved to /var/cache/conftool/dbconfig/20260504-113620-fceratto.json
* 11:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92194 and previous config saved to /var/cache/conftool/dbconfig/20260504-113550-fceratto.json
* 11:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1162: after reimage to trixie
* 11:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5003.eqsin.wmnet with reason: host reimage
* 11:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92192 and previous config saved to /var/cache/conftool/dbconfig/20260504-112542-fceratto.json
* 11:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P92191 and previous config saved to /var/cache/conftool/dbconfig/20260504-111534-fceratto.json
* 11:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92189 and previous config saved to /var/cache/conftool/dbconfig/20260504-110526-fceratto.json
* 11:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2187: repool after maintenance
* 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92187 and previous config saved to /var/cache/conftool/dbconfig/20260504-105808-fceratto.json
* 10:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92186 and previous config saved to /var/cache/conftool/dbconfig/20260504-105739-fceratto.json
* 10:48 moritzm: installing bash updates from trixie point release
* 10:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92184 and previous config saved to /var/cache/conftool/dbconfig/20260504-104731-fceratto.json
* 10:42 moritzm: installing postgresql-17 security updates
* 10:42 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1162: after reimage to trixie
* 10:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS trixie
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum5003.eqsin.wmnet with OS bookworm
* 10:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P92181 and previous config saved to /var/cache/conftool/dbconfig/20260504-103723-fceratto.json
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:35 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum5003.eqsin.wmnet on all recursors
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum5003.eqsin.wmnet - jmm@cumin2002"
* 10:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92179 and previous config saved to /var/cache/conftool/dbconfig/20260504-102715-fceratto.json
* 10:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum5003.eqsin.wmnet
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92178 and previous config saved to /var/cache/conftool/dbconfig/20260504-101855-fceratto.json
* 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92177 and previous config saved to /var/cache/conftool/dbconfig/20260504-101826-fceratto.json
* 10:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2187: repool after maintenance
* 10:16 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:15 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1162.eqiad.wmnet with reason: host reimage
* 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92174 and previous config saved to /var/cache/conftool/dbconfig/20260504-100818-fceratto.json
* 10:02 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS trixie
* 10:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1162: Reimage to Trixie
* 10:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1162.eqiad.wmnet with reason: Reimage to Trixie
* 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P92172 and previous config saved to /var/cache/conftool/dbconfig/20260504-095810-fceratto.json
* 09:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5005.wikimedia.org
* 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92171 and previous config saved to /var/cache/conftool/dbconfig/20260504-094802-fceratto.json
* 09:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5005.wikimedia.org
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92170 and previous config saved to /var/cache/conftool/dbconfig/20260504-093938-fceratto.json
* 09:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92169 and previous config saved to /var/cache/conftool/dbconfig/20260504-093910-fceratto.json
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 09:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1182: after reimage to trixie
* 09:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92167 and previous config saved to /var/cache/conftool/dbconfig/20260504-092902-fceratto.json
* 09:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P92165 and previous config saved to /var/cache/conftool/dbconfig/20260504-091853-fceratto.json
* 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2187: Fixing events
* 09:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2187: Fixing events
* 09:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2187.codfw.wmnet with reason: Checking events
* 09:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92163 and previous config saved to /var/cache/conftool/dbconfig/20260504-090845-fceratto.json
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92161 and previous config saved to /var/cache/conftool/dbconfig/20260504-085930-fceratto.json
* 08:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92160 and previous config saved to /var/cache/conftool/dbconfig/20260504-085912-fceratto.json
* 08:56 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main'.
* 08:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1182: after reimage to trixie
* 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92158 and previous config saved to /var/cache/conftool/dbconfig/20260504-084904-fceratto.json
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1008.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1007.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1006.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1005.eqiad.wmnet
* 08:43 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1004.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1003.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-worker1001.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1002.eqiad.wmnet
* 08:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: tools-k8s-ctrl1001.eqiad.wmnet
* 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P92157 and previous config saved to /var/cache/conftool/dbconfig/20260504-083857-fceratto.json
* 08:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS trixie
* 08:32 moritzm: installing Linux 5.10.251-3 on bullseye hosts
* 08:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92156 and previous config saved to /var/cache/conftool/dbconfig/20260504-082849-fceratto.json
* 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
* 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T419961|T419961]])', diff saved to https://phabricator.wikimedia.org/P92155 and previous config saved to /var/cache/conftool/dbconfig/20260504-082024-fceratto.json
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 08:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 08:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
* 08:15 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:11 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1182.eqiad.wmnet with reason: host reimage
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 08:04 gkyziridis@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: sync
* 08:04 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] (duration: 07m 58s)
* 08:03 gkyziridis@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: sync
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 08:02 gkyziridis@deploy1003: helmfile [staging] DONE helmfile.d/services/eventstreams: sync
* 08:02 gkyziridis@deploy1003: helmfile [staging] START helmfile.d/services/eventstreams: sync
* 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 08:01 moritzm: installing Linux 6.1.170 on bookworm hosts
* 07:59 urbanecm@deploy1003: urbanecm, h2o: Continuing with deployment
* 07:57 urbanecm@deploy1003: urbanecm, h2o: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1212: after reimage to trixie
* 07:56 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1277256{{!}}Add sva to wmgExtraLanguageNames (T407106)]]
* 07:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 07:55 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS trixie
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:51 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:47 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1182: Reimage to Trixie
* 07:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1182: Reimage to Trixie
* 07:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1182.eqiad.wmnet with reason: Reimage to Trixie
* 07:44 dcausse: [[phab:T425301|T425301]]: stopping writes on cloudelastic
* 07:44 dcausse@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 07:44 dcausse@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2147.codfw.wmnet
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:42 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:42 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2147.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
* 07:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2149: after reimage to trixie
* 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1188: after reimage to trixie
* 07:38 moritzm: installing Linux 6.12.85 on trixie hosts
* 07:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo2003.codfw.wmnet
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo2003.codfw.wmnet
* 07:33 marostegui@cumin1003: START - Cookbook sre.dns.netbox
* 07:28 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2147.codfw.wmnet
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1212: after reimage to trixie
* 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1212.eqiad.wmnet with OS trixie
* 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2149: after reimage to trixie
* 06:55 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1188: after reimage to trixie
* 06:52 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS trixie
* 06:47 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS trixie
* 06:43 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:37 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1212.eqiad.wmnet with reason: host reimage
* 06:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:25 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:21 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1212.eqiad.wmnet with OS trixie
* 06:19 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage
* 06:17 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2149.codfw.wmnet with reason: host reimage
* 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1212: Reimage to Trixie
* 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1212.eqiad.wmnet with reason: Reimage to Trixie
* 06:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 13 hosts with reason: Sanitarium s3 master: reimage to Debian Trixie
* 06:09 marostegui: Reimage sanitarium master for s3, lag to be expected on wikireplicas for s3 [[phab:T424792|T424792]]
* 06:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS trixie
* 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1188: Reimage to Trixie
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1188.eqiad.wmnet with reason: Reimage to Trixie
* 05:57 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS trixie
* 05:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2149: Reimage to Trixie
* 05:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2149: Reimage to Trixie
* 05:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2149.codfw.wmnet with reason: Reimage to Trixie
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 36s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-05-03 ==
* 14:11 ladsgroup@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] (duration: 10m 51s)
* 14:05 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
* 14:04 ladsgroup@deploy1003: ladsgroup: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 ladsgroup@deploy1003: Started scap sync-world: Backport for [[gerrit:1281991{{!}}Disable uploads in scnwiki (T425278)]]
* 12:27 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]] (duration: 29m 22s)
* 11:58 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281963{{!}}Remove Wikinews from installer's default main page]]
== 2026-05-02 ==
* 23:32 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] (duration: 06m 41s)
* 23:28 zabe@deploy1003: dreamyjazz, zabe: Continuing with deployment
* 23:27 zabe@deploy1003: dreamyjazz, zabe: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:26 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281775{{!}}Uninstall DynamicPageList from wikis it's not used on (T425202)]]
* 23:22 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] (duration: 07m 27s)
* 23:18 zabe@deploy1003: zabe, dreamyjazz: Continuing with deployment
* 23:17 zabe@deploy1003: zabe, dreamyjazz: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:15 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1281739{{!}}Uninstall DynamicPageList from officewiki (T425154)]]
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
* 18:07 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
* 18:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:59 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2369.codfw.wmnet with OS trixie
* 17:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2369.codfw.wmnet with reason: host reimage
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2374.codfw.wmnet with OS trixie
* 17:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2373.codfw.wmnet with OS trixie
* 17:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:09 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2372.codfw.wmnet with OS trixie
* 17:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 17:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2370.codfw.wmnet with OS trixie
* 17:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 16:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:44 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2374.codfw.wmnet with reason: host reimage
* 16:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2373.codfw.wmnet with reason: host reimage
* 16:42 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2372.codfw.wmnet with reason: host reimage
* 16:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2371.codfw.wmnet with reason: host reimage
* 16:31 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2374.codfw.wmnet with OS trixie
* 16:30 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2373.codfw.wmnet with OS trixie
* 16:29 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2372.codfw.wmnet with OS trixie
* 16:28 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2370.codfw.wmnet with reason: host reimage
* 16:20 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2371.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2370.codfw.wmnet with OS trixie
* 16:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2369.codfw.wmnet with OS trixie
* 16:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 16:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 15:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 15:38 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 15:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 15:25 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 15:24 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 15:23 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 12:02 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] (duration: 13m 06s)
* 11:57 samtar@deploy1003: samtar: Continuing with deployment
* 11:50 samtar@deploy1003: samtar: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:49 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281747{{!}}Watchlist star: Revert popover/dialog changes (T425185)]]
* 09:20 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 09:19 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2366.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2368.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2367.codfw.wmnet with OS trixie
* 02:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 31s)
* 02:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2368.codfw.wmnet with reason: host reimage
* 01:50 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2367.codfw.wmnet with reason: host reimage
* 01:49 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2366.codfw.wmnet with reason: host reimage
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2368.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2367.codfw.wmnet with OS trixie
* 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2366.codfw.wmnet with OS trixie
* 01:35 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2364.codfw.wmnet with OS trixie
* 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2365.codfw.wmnet with OS trixie
* 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:23 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2363.codfw.wmnet with OS trixie
* 01:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 01:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 01:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2365.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2364.codfw.wmnet with reason: host reimage
* 00:57 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2363.codfw.wmnet with reason: host reimage
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2365.codfw.wmnet with OS trixie
* 00:45 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2364.codfw.wmnet with OS trixie
* 00:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2363.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2362.codfw.wmnet with OS trixie
* 00:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2361.codfw.wmnet with OS trixie
* 00:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2360.codfw.wmnet with OS trixie
* 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 00:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
== 2026-05-01 ==
* 23:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2362.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2361.codfw.wmnet with reason: host reimage
* 23:39 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2360.codfw.wmnet with reason: host reimage
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2362.codfw.wmnet with OS trixie
* 23:27 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2361.codfw.wmnet with OS trixie
* 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2360.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2357.codfw.wmnet with OS trixie
* 23:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:25 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2359.codfw.wmnet with OS trixie
* 23:22 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:22 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2358.codfw.wmnet with OS trixie
* 23:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 23:08 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 23:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 23:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2359.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2357.codfw.wmnet with reason: host reimage
* 22:56 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2358.codfw.wmnet with reason: host reimage
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2359.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2358.codfw.wmnet with OS trixie
* 22:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-worker2357.codfw.wmnet with OS trixie
* 22:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:23 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:19 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2374.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2373.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2372.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 22:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:57 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:50 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2371.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:49 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2370.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2369.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2368.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2367.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2366.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2365.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2364.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2363.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2362.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:19 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2361.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2360.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:10 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2359.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2358.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:06 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-worker2357.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2374
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2374
* 21:04 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2373
* 21:04 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2373
* 20:59 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2372
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2372
* 20:58 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2371
* 20:58 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2371
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2370
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2370
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2369
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2369
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2368
* 20:57 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2368
* 20:57 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2367
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2367
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2366
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2366
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2365
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2365
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2364
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2364
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2363
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2363
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2362
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2362
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2361
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2361
* 20:55 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2360
* 20:55 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2360
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2359
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2359
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2358
* 20:54 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2358
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2357
* 20:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2357
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:53 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding wikikube-worker2357 to codfw - jhancock@cumin2002"
* 20:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2002.codfw.wmnet with OS trixie
* 20:06 krinkle@deploy1003: Finished scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] (duration: 15m 27s)
* 20:02 krinkle@deploy1003: krinkle: Continuing with deployment
* 19:54 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:52 krinkle@deploy1003: krinkle: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:51 krinkle@deploy1003: Started scap sync-world: Backport for [[gerrit:1269440{{!}}Enable wgTrackMediaRequestProvenance on wikidata.org (T414338)]], [[gerrit:1269441{{!}}Enable wgTrackMediaRequestProvenance on Commons (T414338)]]
* 19:49 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2002.codfw.wmnet with reason: host reimage
* 19:40 dancy@deploy1003: Finished scap sync-world: testing [[phab:T317405|T317405]] (duration: 03m 23s)
* 19:37 dancy@deploy1003: Started scap sync-world: testing [[phab:T317405|T317405]]
* 19:36 dancy@deploy1003: Installation of scap version "4.259.0" completed for 2 hosts
* 19:34 dancy@deploy1003: Installing scap version "4.259.0" for 2 host(s)
* 18:55 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 18:55 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'sync'.
* 18:43 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Alangi Derick out of all services on: 2442 hosts
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2002
* 18:41 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2002
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2002.codfw.wmnet 50.16.192.10.in-addr.arpa 0.5.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2002 - herron@cumin1003"
* 18:36 herron@cumin1003: START - Cookbook sre.dns.netbox
* 18:33 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2002
* 18:32 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2002.codfw.wmnet with OS trixie
* 18:26 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS trixie
* 18:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 18:00 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2003
* 17:41 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2003
* 17:40 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2003
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2003.codfw.wmnet 24.32.192.10.in-addr.arpa 4.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:40 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:40 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2003 - herron@cumin1003"
* 17:33 herron@cumin1003: START - Cookbook sre.dns.netbox
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2003
* 17:28 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS trixie
* 17:15 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2004.codfw.wmnet with OS trixie
* 16:34 cdobbins@cumin2002: conftool action : get/pooled; selector: name=cp5024.eqsin.wmnet
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 16:30 ebernhardson@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 16:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 16:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 16:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 15:59 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 15:47 dancy@deploy1003: Installation of scap version "4.258.1" completed for 2 hosts
* 15:45 dancy@deploy1003: Installing scap version "4.258.1" for 2 host(s)
* 15:34 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:30 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2004.codfw.wmnet with reason: host reimage
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host kafka-logging2004
* 15:14 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
* 15:11 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: START - Cookbook sre.dns.wipe-cache kafka-logging2004.codfw.wmnet 38.16.192.10.in-addr.arpa 8.3.0.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:11 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:11 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host kafka-logging2004 - herron@cumin1003"
* 15:05 dancy@deploy1003: Installation of scap version "4.258.0" completed for 2 hosts
* 15:03 dancy@deploy1003: Installing scap version "4.258.0" for 2 host(s)
* 14:57 herron@cumin1003: START - Cookbook sre.dns.netbox
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host kafka-logging2004
* 14:47 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS trixie
* 13:45 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 13:44 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 13:24 _Gerges: WikiMonitor setup
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1080
* 13:09 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1078
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1079
* 13:09 jclark@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host cloudvirt1077
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1080
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1079
* 13:09 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1078
* 13:08 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1077
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1079.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1078.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1077.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host cloudvirt1080.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding cloudvirt1077 to eqiad - jclark@cumin1003"
* 13:00 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 12:34 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
* 12:34 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
* 12:33 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
* 09:57 samtar@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] (duration: 06m 49s)
* 09:53 samtar@deploy1003: samtar: Continuing with deployment
* 09:52 samtar@deploy1003: samtar: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:50 samtar@deploy1003: Started scap sync-world: Backport for [[gerrit:1281423{{!}}Switch watchstar from Popover to Dialog (T417847)]]
* 09:38 urbanecm@deploy1003: Finished scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]] (duration: 06m 05s)
* 09:32 urbanecm@deploy1003: Started scap sync-world: Backport for [[gerrit:1281426{{!}}Update the interwiki cache (T239173)]]
* 08:13 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 08:13 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 08:12 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 03:26 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 41s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 00:16 zabe@deploy1003: Finished scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] (duration: 07m 05s)
* 00:13 zabe@deploy1003: zabe: Continuing with deployment
* 00:11 zabe@deploy1003: zabe: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:09 zabe@deploy1003: Started scap sync-world: Backport for [[gerrit:1280417{{!}}Add script to fix fr_deleted drifts (T424553)]]
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
e4lv9jn8udgyrn1alfi2y65e2crfxp7
Obsolete:Analytics/Archive/EventLogging/Outages
110
15224
2421375
2259952
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Outages]] to [[Obsolete:Analytics/Archive/EventLogging/Outages]]: pages are obsolete
2201887
wikitext
text/x-wiki
EventLogging outages are recorded along with ops incidents on the Wikitech page [[Incident documentation]]. Look for incidents with "EventLogging" in the title. Alternatively, look for this category: [[:Category:EventLogging/Incident documentation]].
[[Category:Data platform]]
[[Category:Data platform systems]]
9u33dstt9424yfy31ulsh22407xkbba
Event logging/Operations/Outages
0
15225
2421499
2266563
2026-05-31T09:23:01Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Outages]]
2421499
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster
110
19133
2421387
2259964
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/TestingOnBetaCluster]] to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]: pages are obsolete
2201893
wikitext
text/x-wiki
{{Notice|This documentation is outdated. See [[Event_Platform/Instrumentation_How_To]].}}
The consumer side of event logging can be easily tested on Beta Cluster.
== Instance ==
The instance name is configured here: https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/CommonSettings-labs.php
Note that the instance may change at any time, but apart from the instance name the rest of the information in this document should still apply.
Note that you need <code>sudo</code> on this instance to see the logs. Anyone who wants to test on Beta Cluster should ask for sudo on <code>deployment-eventlog08</code>.
It is unfortunate that sudo is required but that is the state of affairs right now.
== How to create test events ==
=== How to log a client-side event to Beta Cluster directly ===
Just hit the Varnish endpoint on labs, for example:
curl -A "WikipediaApp/22.0.22-alpha-2017-11-01 (Android 8.0.0; Phone) Alpha Channel" 'https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22pageTitleSource%22%3A%22Main%20Page%22%2C%22namespaceIdSource%22%3A0%2C%22pageIdSource%22%3A1%2C%22isAnon%22%3Atrue%2C%22popupEnabled%22%3Atrue%2C%22pageToken%22%3A%223ec574813fadb97b%22%2C%22sessionToken%22%3A%228460c2e4d547b250%22%2C%22previewCountBucket%22%3A%220%20previews%22%2C%22hovercardsSuppressedByGadget%22%3Afalse%2C%22action%22%3A%22pageLoaded%22%7D%2C%22revision%22%3A16364296%2C%22schema%22%3A%22Popups%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D'
https://en.wikipedia.beta.wmflabs.org/beacon/event?%7B%22event%22%3A%7B%22country%22%3A%22US%22%2C%22region%22%3A%22WA%22%2C%22anonymous%22%3Atrue%2C%22project%22%3A%22wikipedia%22%2C%22db%22%3A%22enwiki%22%2C%22uselang%22%3A%22en%22%2C%22device%22%3A%22desktop%22%2C%22debug%22%3Afalse%2C%22randomcampaign%22%3A0.8838892205730462%2C%22randombanner%22%3A0.7340400211496478%2C%22recordImpressionSampleRate%22%3A0.01%2C%22impressionEventSampleRate%22%3A1%2C%22status%22%3A%22banner_shown%22%2C%22statusCode%22%3A%226%22%2C%22campaign%22%3A%22CN%20browser%20tests%22%2C%22campaignCategory%22%3A%22CNbrowsertests%22%2C%22campaignCategoryUsesLegacy%22%3Afalse%2C%22bucket%22%3A0%2C%22banner%22%3A%22browser_test_b3%22%2C%22bannerCategory%22%3A%22CNbrowsertests%22%2C%22result%22%3A%22show%22%2C%22testIdentifiers%22%3A%22popupsUnknown%22%7D%2C%22revision%22%3A17995347%2C%22schema%22%3A%22CentralNoticeImpression%22%2C%22webHost%22%3A%22en.wikipedia.beta.wmflabs.org%22%2C%22wiki%22%3A%22enwiki%22%7D;
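The beacon URLs above are JSON event capsules, percent-encoded and appended to the endpoint as the query string. The following is a minimal Python sketch of how such a URL could be assembled. The helper name <code>build_beacon_url</code> and the shortened capsule are illustrative assumptions; a real event needs every field that its schema revision requires.

```python
import json
from urllib.parse import quote, unquote

# Beta Cluster beacon endpoint, taken from the examples above.
BEACON = "https://en.wikipedia.beta.wmflabs.org/beacon/event"


def build_beacon_url(capsule: dict, endpoint: str = BEACON) -> str:
    """Return endpoint + '?' + percent-encoded JSON (hypothetical helper)."""
    # Compact separators mirror the examples above, which have no spaces
    # between JSON tokens.
    payload = json.dumps(capsule, separators=(",", ":"))
    # safe="" also encodes '/', so the payload is one opaque query string.
    return endpoint + "?" + quote(payload, safe="")


# Shortened, illustrative capsule (not a complete Popups event).
capsule = {
    "event": {"action": "pageLoaded", "isAnon": True},
    "revision": 16364296,
    "schema": "Popups",
    "webHost": "en.wikipedia.beta.wmflabs.org",
    "wiki": "enwiki",
}

url = build_beacon_url(capsule)
# Decoding the query string gives back the original capsule.
decoded = json.loads(unquote(url.split("?", 1)[1]))
print(url)
```

The resulting URL can be passed to <code>curl</code> in place of the hand-encoded ones.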
=== How to log via the website ===
For example, browse http://en.m.wikipedia.beta.wmflabs.org/wiki/Main_Page to generate events from the mobile site.
=== How to load test with a bunch of events ===
There's a script that may be handy. It's in the same eventlogging codebase:
https://github.com/wikimedia/eventlogging/blob/master/bin/eventlogging-load-tester
== How to verify events ==
You can tail the files in <code>/srv/log/eventlogging</code> on <code>deployment-eventlog08.deployment-prep.eqiad1.wikimedia.cloud</code> to verify that your events are coming through.
Unless noted otherwise, the files mentioned in this section and its subsections are in this directory.
ssh deployment-eventlog08.deployment-prep.eqiad1.wikimedia.cloud
cd /srv/log/eventlogging
=== Validated events ===
==== In MySQL ====
All events in beta should be written to the MySQL <tt>log</tt> database hosted on the beta eventlogging server.
ssh deployment-eventlog08.deployment-prep.eqiad1.wikimedia.cloud
sudo mysql --skip-ssl log
show tables;
...
==== In files ====
* <code>all-events.log</code>: schema-validated events that are inserted into MySQL appear in this file (the "all" in the name is misleading).
Tail this file while you use the website and emit server-side or client-side events. If your events are valid, they should appear there within seconds. This file holds the contents of the eventlogging_valid_mixed topic in Kafka.
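When watching this file, it can help to filter for a single schema. The sketch below assumes that each line is one JSON object with a top-level <code>schema</code> field, as in the beacon payloads earlier on this page. The helper name and the sample lines are made up for illustration and are not taken from a real log.

```python
import json
from typing import Iterable, Iterator


def events_for_schema(lines: Iterable[str], schema: str) -> Iterator[dict]:
    """Yield parsed events whose top-level "schema" matches (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a complete JSON object,
            # for example a partially written line.
            continue
        if isinstance(event, dict) and event.get("schema") == schema:
            yield event


# Made-up sample lines standing in for the contents of all-events.log.
sample = [
    '{"schema": "Popups", "wiki": "enwiki", "event": {"action": "pageLoaded"}}',
    '{"schema": "NavigationTiming", "wiki": "enwiki", "event": {}}',
    "not json",
]
matches = list(events_for_schema(sample, "Popups"))
print(matches)
```

On the host, an open file handle for <code>all-events.log</code> can be passed as <code>lines</code> instead of the sample list.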
==== In Kafka ====
You can consume valid events directly from Kafka. First, list the available topics:
kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'
Then consume from your topic, which should be named something like eventlogging_<schema>. If your events are valid, you should see them there.
kafkacat -C -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 -t eventlogging_NavigationTiming
=== Raw stream of events (including unvalidated events) ===
* <code>client-side-events.log</code>: client side events appear in this file (valid and not)
If events do not appear, they might not be valid. Check <code>/srv/log/eventlogging/systemd</code> and <code>tail -f</code> + <code>grep</code> the following:
eventlogging-processor@client-side-XX.log
Validation errors appear in those logs and are very descriptive. '''Note''': you may see a <code>-00</code> log and a <code>-01</code> log; they exist for parallelization, and you should monitor both.
=== Where is eventlogging code? ===
/srv/deployment/eventlogging/analytics/eventlogging
=== See all EventLogging schema Kafka topics ===
kafkacat -L -b deployment-kafka-jumbo-1.deployment-prep.eqiad1.wikimedia.cloud:9092 | grep 'topic "eventlogging_' | awk -F '"' '{print $2}'
All EventLogging topics for which valid events are being sent should be listed here.
== Database ==
In order to see events you can use the eventlogging user whose user and password are listed at:
/etc/eventlogging.d/consumers/mysql-m4-master
If you have sudo on the machine the mysql password for the root user is 'secret', otherwise:
mysql -h 127.0.0.1 --user=eventlogging --password=68QrOq220717816UycU1 --skip-ssl
(it's labs, the password is not really a secret.)
If mysql needs a re-start:
systemctl restart mysql
The MySQL setup on beta leaves much to be desired; if MySQL does not start, check /var/log/mysql.err
This might be of help: [http://www.chriscalender.com/disabling-transparent-hugepages-for-tokudb/]
Please also keep in mind that the eventlogging_cleaner.py script runs periodically to purge/sanitize records according to its whitelist as it happens in production. Some useful commands:<syntaxhighlight lang="bash">
elukey@deployment-eventlog08:~$ systemctl list-timers | grep sanitization
Wed 2018-10-24 11:00:00 UTC 20h left Tue 2018-10-23 11:00:14 UTC 3h 19min ago eventlogging_db_sanitization.timer eventlogging_db_sanitization.service
elukey@deployment-eventlog08:~$ systemctl cat eventlogging_db_sanitization.service
# /lib/systemd/system/eventlogging_db_sanitization.service
[Unit]
Description=Apply Analytics data retetion policies to the Eventlogging database
[Service]
User=eventlogcleaner
ExecStart=/usr/local/bin/eventlogging_cleaner ...[cut]...
</syntaxhighlight>Two notable things:
* --no-whitelist-sanity-check is not used in production but only in beta.
* The systemd timer updates /srv/eventlogging/eventlogging_cleaner with the timestamp of the next day to start sanitizing from during the next run (that happens once a day).
== Admin ==
=== Give people access ===
Add them to the lists on these wikis (you need to be an admin to do that)
Asking in {{irc|wikimedia-cloud}} might be a way to get help.
[[Nova_Resource:Deployment-prep]]
[[Special:NovaProject]] -> add users to deployment-prep
=== How to deploy code ===
<syntaxhighlight lang="bash">
# Log into the beta deploy server
ssh deployment-tin.deployment-prep.eqiad1.wikimedia.cloud
# cd to the EventLogging analytics deploy source
cd /srv/deployment/eventlogging/analytics
# Deploy using scap3 in the beta environment
scap deploy -e beta
</syntaxhighlight>
You can run puppet with
puppet agent -tv
=== Restart EventLogging ===
Check:
sudo eventloggingctl status
Run:
sudo eventloggingctl restart
Stop completely:
sudo eventloggingctl stop
[[Category:Data platform]]
[[Category:Data platform systems]]
my5deazjm85xttn5browe10oxgok1q1
Obsolete:Analytics/Archive/EventLogging/Backfilling
110
21195
2421351
2381069
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Backfilling]] to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]: pages are obsolete
2381069
wikitext
text/x-wiki
= Backfilling a kafka eventlogging_<Schema> topic =
It is possible for an eventlogging_<Schema> topic to be missing data (for example after a host issue) while the event data is in fact present in the eventlogging-client-side topic. Since eventlogging-client-side is also imported into Hadoop, you can run a local backfill eventlogging-processor on a stat machine (the eventlogging role has recently been deployed there) and pump the data back into the topics where it belongs.
* Get files from Hadoop for the time intervals that are missing (note files are compressed but you want them to be in plain text)
hdfs dfs -text /wmf/data/raw/eventlogging_client_side/eventlogging-client-side/hourly/2019/04/02/21/* > 2019-04-21.txt
* Clean up files a bit, they have an extra sequence number:
more 2019-04-01-19.txt | awk '{$1=""; print substr($0,2)}' | awk 'NF' > 2019-04-01-19-plain.txt
* Check out eventlogging into your home directory, then run:
export PYTHONPATH=/home/nuria/eventlogging
export PATH="$PATH:/home/nuria/eventlogging/bin"
eventlogging-processor --help
* Once you have a file you want to pump into Kafka, run the following (be careful with quotes, the format is picky):
You need to set the proxies first:
export http_proxy=<nowiki>http://webproxy.eqiad.wmnet:8080</nowiki>
export https_proxy=<nowiki>http://webproxy.eqiad.wmnet:8080</nowiki>
<nowiki>ionice cat 2019-04-01-19-plain.txt | eventlogging-processor '%q %{recvFrom}s %{seqId}d %D %{ip}i %u' 'stdin://' 'kafka-confluent:///kafka-jumbo1001.eqiad.wmnet:9092,kafka-jumbo1002.eqiad.wmnet:9092,kafka-jumbo1003.eqiad.wmnet:9092,kafka-jumbo1004.eqiad.wmnet:9092,kafka-jumbo1005.eqiad.wmnet:9092,kafka-jumbo1006.eqiad.wmnet:9092?topic=eventlogging_{schema}&message.send.max.retries=6,retry.backoff.ms=200' --output-invalid</nowiki>
* After Camus runs, there should be new files in, for example, this partition:
ls -la /mnt/hdfs/wmf/data/raw/eventlogging/eventlogging_VirtualPageView/hourly/2019/04/01/20
* If the backfilled events are more than 1 month in the past, you need to run refine for the data to be refined (on an-coord1001):
sudo -u hdfs /usr/local/bin/refine_eventlogging_legacy --since 2019-04-01T19:00:00 --until 2019-04-01T21:00:00
* Also events_sanitized might need to be backfilled
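The clean-up step above drops the first whitespace-separated field (the extra sequence number) and skips empty lines. A rough Python equivalent, as a sketch (the function name <code>strip_sequence_number</code> is made up for illustration; unlike the awk version it does not collapse whitespace between the remaining fields):

<syntaxhighlight lang="python">
def strip_sequence_number(lines):
    """Yield each line without its first whitespace-separated field."""
    for line in lines:
        parts = line.strip().split(None, 1)
        # Blank lines and lines holding only the sequence number are skipped,
        # mirroring the `awk 'NF'` filter.
        if len(parts) == 2:
            yield parts[1]


# Hypothetical input lines, only meant to show the shape of the data.
raw = ["1001 ?%7B%22a%22%3A1%7D; host1 7 2019-04-01T19:00:00\n", "\n", "1002\n"]
for cleaned in strip_sequence_number(raw):
    print(cleaned)
</syntaxhighlight>

For real files, pass an open file object instead of the list, for example <code>strip_sequence_number(open("2019-04-01-19.txt"))</code>.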
= Get all events from kafka topics for a given timeframe =
To get all events for a given timeframe from Kafka, consume N messages starting at a given offset.
The following command prints a message together with its offset:
 kafkacat -C -b kafka1012.eqiad.wmnet:9092 -c 1 -t eventlogging-client-side -f "%o %s\n"
The following command will return 12 messages starting at offset 35...
kafkacat -C -b kafka-jumbo1002.eqiad.wmnet:9092 -t eventlogging-client-side -o 3597866000 -c12
= Backfilling refine =
If an interval of data is missing or corrupted in Hive's event database (eventlogging tables), we can re-run the refine process.
# Ssh into <code>an-launcher1003.eqiad.wmnet</code>
# Execute <code>systemctl list-timers | grep refine_eventlogging</code>
# Choose the timer that corresponds to the data that you want to correct and run <code>systemctl cat refine_eventlogging_<foo></code>
# The results will show you the script that is used to run the EventLogging refine in the ExecStart field, i.e.: <code>ExecStart=/usr/local/bin/refine_eventlogging_analytics</code>.
# Cat it and copy the command. You don't need to copy the <code>is-yarn-app-running</code> part of the command, just the <code>spark2-submit</code> part. You can also leave the <code>"${@}"</code> out.
# You might want to add sudo -u hdfs/analytics in front of it. It's a good idea to change the name as well, like prefixing it with <code>backfill_</code>. Add --since and --until parameters, to override the properties file. If you're not sure if you need to override other properties (like emails_to or table_blacklist_regex, etc.), get the properties file path from the <code>spark2-submit</code> command and cat it.
# Run your tailored command in an-launcher1003.eqiad.wmnet.
= Backfilling sanitization =
If an interval of data is missing, corrupted or out of date in Hive's event_sanitized database, we can re-run the sanitization process.
'''Note that EL sanitization already does a second pass 45 days after data collection. So if the data that you want to backfill is not older than 45 days, you don't need to backfill it (it will be done automatically after 45 days), unless it's urgent!''' One nuance to keep in mind is whether the alarms are alerting about a schema that was recently added to the allowlist. If this is the case, the alarms could be false positives as RefineSanitize is trying to catch up.
# Ssh into an-launcher1003.eqiad.wmnet
# Execute <code>systemctl cat refine_sanitize_eventlogging_analytics_delayed</code> (the _immediate and _delayed versions of this timer and scripts are identical except for their "since" and "until" parameters in their respective .properties files).
# The results will show you the script that is used to run the EventLogging sanitization in the ExecStart field, i.e.: <code>ExecStart=/usr/local/bin/refine_sanitize_eventlogging_analytics_delayed</code>.
# Cat it and copy the command. You don't need to copy the <code>is-yarn-app-running</code> part of the command, just the <code>spark3-submit</code> part. You can also leave the <code>"${@}"</code> out.
# Run it with sudo -u analytics, and modify the name of the job, for example by prefixing it with <code>backfill_</code>. Add --since and --until parameters to override the properties file (make sure to use the YYYY-MM-DDTHH:mm:ss format). If you're not sure whether you need to override other properties (like emails_to or table_blacklist_regex, etc.), get the properties file path from the <code>spark2-submit</code> command and cat it.
# If you're only sanitizing data for one or a few streams, you'll need to update the whitelist.yaml that the properties file points to. Copy the properties file to your directory, change its absolute path in the command. Then change the hdfs absolute path to the <code>whitelist.yaml</code> and copy/modify that file as well. Remember to keep the <code>_defaults</code> section when deleting other sections.
# Run the command. If it starts retrying (switching from RUNNING to ACCEPTED multiple times) you have to let it retry 6 times without killing it, otherwise it won't generate logs and you won't be able to figure out why it failed. Look at logs as usual with yarn logs.
<div class="toccolours mw-collapsible mw-collapsed">
Script example to run RefineSanitizeMonitor & RefineSanitize:
<syntaxhighlight lang="shell" class="mw-collapsible-content">
# on statbox
spark2-submit \
--class org.wikimedia.analytics.refinery.job.refine.RefineSanitizeMonitor \
--master yarn \
--deploy-mode client \
/srv/deployment/analytics/refinery/artifacts/org/wikimedia/analytics/refinery/refinery-job-0.1.15.jar \
--output_database event_sanitized \
--output_path /wmf/data/event_sanitized \
--since "2022-06-19T09:52:00+0000" \
--until "2022-06-19T11:02:00+0000" \
--table_include_regex mediawiki_revision_create \
--allowlist_path /wmf/refinery/current/static_data/sanitization/event_sanitized_main_allowlist.yaml \
--input_database event \
--keep_all_enabled true \
--should_email_report true \
--to_emails <youremail>@wikimedia.org
# on an-launcher1002
sudo -u analytics kerberos-run-command analytics \
spark2-submit \
--class org.wikimedia.analytics.refinery.job.refine.RefineSanitize \
--master yarn \
--deploy-mode client \
/srv/deployment/analytics/refinery/artifacts/org/wikimedia/analytics/refinery/refinery-job-0.1.15.jar \
--output_database event_sanitized \
--output_path /wmf/data/event_sanitized \
--since "2022-06-19T09:52:00+0000" \
--until "2022-06-19T11:02:00+0000" \
--table_include_regex mediawiki_revision_create \
--allowlist_path /wmf/refinery/current/static_data/sanitization/event_sanitized_main_allowlist.yaml \
--input_database event \
--keep_all_enabled true \
--should_email_report true \
--to_emails <youremail>@wikimedia.org \
--salts_path /user/hdfs/salts/event_sanitized
</syntaxhighlight>
</div>
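The <code>--since</code> and <code>--until</code> values in the steps above must use the YYYY-MM-DDTHH:mm:ss format. As a small sketch (the helper name <code>backfill_window_args</code> is made up for illustration), the two arguments for a backfill window can be generated in Python rather than typed by hand:

<syntaxhighlight lang="python">
from datetime import datetime, timedelta

# Timestamp format expected by --since / --until (YYYY-MM-DDTHH:mm:ss).
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def backfill_window_args(start, hours):
    """Return the --since/--until arguments covering `hours` hours from `start`."""
    end = start + timedelta(hours=hours)
    return [
        "--since", start.strftime(TIMESTAMP_FORMAT),
        "--until", end.strftime(TIMESTAMP_FORMAT),
    ]


print(" ".join(backfill_window_args(datetime(2019, 4, 1, 19), 2)))
# prints: --since 2019-04-01T19:00:00 --until 2019-04-01T21:00:00
</syntaxhighlight>

The printed arguments can be appended to the copied <code>spark2-submit</code> or <code>spark3-submit</code> command.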
== Prerequirements ==
You need sudo on Beta Cluster to test the backfilling scripts and also sudo on eventlog1002 to do the backfilling for real: [[Analytics/Systems/EventLogging/TestingOnBetaCluster|EventLogging/Testing/BetaCluster]]
This document describes how to backfill from "processed" events. If you need to backfill from raw events, like the ones stored in the client-side log, additional steps are needed. The idea is the same, except that a "process" step must be included so that raw events are processed before being inserted into the database.
Note that since this change: [https://gerrit.wikimedia.org/r/#/c/199957/] eventlog1002 only keeps logs for the last 30 days, so an outage should be backfilled as soon as possible.
==First step (data preparation)==
In the first step, split the logs for the relevant day into files of 64K lines. This size keeps you from running into memory issues.
(Having such small files gives good control over the timespan you want to backfill, and it allows for easy parallelization, speed-up, and fine control during data injection.)
Events can be split with a command like this:
mkdir split && cd split && ionice nice zcat /srv/log/eventlogging/archive/all-events.log-20141114.gz >all-events.log && ionice nice split --lines=64000 all-events.log && rm all-events.log
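The same splitting can be sketched in Python, which avoids writing the intermediate uncompressed <code>all-events.log</code>. This is only an illustration: the function names and the <code>x00000</code>-style part names are assumptions chosen so that a later <code>x*</code> glob still matches, and they differ from the <code>xaa</code>, <code>xab</code>, ... names that <code>split</code> produces.

<syntaxhighlight lang="python">
import gzip
import os
from itertools import islice


def chunked(lines, size=64000):
    """Yield consecutive lists of at most `size` lines."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def split_gzip_log(path, out_dir, size=64000):
    """Write the lines of a gzipped log into numbered part files in out_dir."""
    written = []
    with gzip.open(path, "rt", encoding="utf-8") as source:
        for index, chunk in enumerate(chunked(source, size)):
            part = os.path.join(out_dir, "x%05d" % index)
            with open(part, "w", encoding="utf-8") as out:
                out.writelines(chunk)
            written.append(part)
    return written
</syntaxhighlight>

A call such as <code>split_gzip_log("/srv/log/eventlogging/archive/all-events.log-20141114.gz", "split")</code> would then produce the part files, assuming the <code>split</code> directory already exists.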
=== Raw Events ===
If you need to backfill raw events you might find this snippet useful: https://gist.github.com/nuria/e837d16b94c09a4df8a4
Raw event logs (client-side and server-side) include a number of characters that need to be removed before the processors can handle them.
== Second step (data injection)==
You should test your scripts and code in Beta Cluster before trying this on vanadium.
=== Checkout a separate clone of EventLogging ===
The injection is better done using a separate clone of EventLogging. That way the backfilling is not interrupted by other people's EventLogging deployments, and you can use an EventLogging version of your choice.
See for example changes done prior to be able to backfill events 1 by 1 (not batched): [https://gerrit.wikimedia.org/r/#/c/190139/]
To run EventLogging from your local checkout you need to change the python
library search path. So, if you checked out EL code in your home directory,
you would need to tell python where to build it:
cd ~/EventLogging/server
export PYTHONPATH='/home/nuria/backfilling/python'
python ./setup.py develop --install-dir=/home/nuria/backfilling/python
These commands build EL into <code>/home/nuria/backfilling/python</code>.
=== Start a Backfilling Consumer ===
In a simple for loop over those split files (or parts of them in parallel), start a separate EventLogging consumer (that consumes from stdin and writes to m2-master) and pipe the file in. The config for this EventLogging consumer is just a copy of the m2 consumer's config with its input replaced by stdin. I would rename this config so that the process is easy to find when running htop.
The config looks as follows:
nuria@vanadium:~/backfilling$ more mysql-m2-master-BACKFILLING
stdin://
mysql://some-connection-string?charset=utf8&replace=True
Note that the regular consumer batches events. Using that code as is to backfill is fine if you are dealing with a total outage. If your problem is dropped events within the event stream, you cannot batch insertions. In that case you might need to change the consumer code to be able to backfill:
I had to do these changes on 201502: https://gerrit.wikimedia.org/r/#/c/190139/
To check whether your changes are working (again, in Beta Cluster):
/usr/bin/python -OO ./python/eventlogging-consumer @/home/nuria/backfilling/mysql-m2-master-BACKFILLING > log-backfill.txt 2>&1
For each of the started consumers (I could only start two without the db falling too far behind), capture stdout, stderr and the exit code in separate (per-input) files. This makes it easy to verify that the backfilling did not bail out and to correlate log files with input files.
A simple shell script to loop over files and consume each:
<pre>
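#!/bin/bash
# Feed every split file to a backfilling consumer, one file at a time.
for f in 20150208/x*
do
    # Name of the split file without its directory, used to label the log
    l="${f##*/}"
    ionice nice cat "$f" | ionice nice /usr/bin/python -OO ./python/eventlogging-consumer @/home/nuria/backfilling/mysql-m2-master-BACKFILLING > "log-backfill-${l}.txt" 2>&1
    # The log of each file is deleted as soon as its consumer exits
    rm "log-backfill-${l}.txt"
done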
</pre>
== Monitoring ==
There are two things to monitor: the database and the EventLogging hosts. You can monitor the EventLogging hosts with htop; the database stats appear here: https://tendril.wikimedia.org/host/view/db1046.eqiad.wmnet/3306
[[Category:Data platform]]
[[Category:Data platform systems]]
bon3rl062xnmg16an6amtbnlu6spu7z
Obsolete:Analytics/Archive/EventLogging/Administration
110
22578
2421347
2259924
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Administration]] to [[Obsolete:Analytics/Archive/EventLogging/Administration]]: pages are obsolete
2201880
wikitext
text/x-wiki
{{Notice|This documentation is outdated. See [[Event_Platform#Event_Platform_documentation_pages|Event Platform documentation]].}}
==Overview==
The following diagram should be a companion of the excellent explanation in https://github.com/wikimedia/eventlogging
[[File:EventLogging.png|800px]]
The diagram has been created with https://www.draw.io/. If you want the source code, please check https://gist.github.com/elukey/975fab2bcf2ea6398fe1
==Important notes==
===Dependent systems===
[[Graphite#statsv|statsv]] runs on hafnium.eqiad.wmnet; it is a daemon responsible for aggregating performance data before sending it to [[Statsd]]. Some metrics in the EventLogging dashboard, like https://grafana.wikimedia.org/d/000000505/eventlogging?viewPanel=11, rely on this service working properly. If you observe datapoint loss in the metric, please check the status of the statsv service on hafnium and restart it if needed.
===Alarms===
As depicted in the picture we are monitoring:
*Lag between messages landing in Kafka topics and message consumption rate from EventLogging's processes using [https://github.com/linkedin/Burrow Burrow]. The alarm will be triggered as email sent to analytics-alert@.
*Insertion rate to the MySQL master from the Consumer processes using Graphite/Icinga ([https://github.com/wikimedia/operations-puppet/blob/production/modules/eventlogging/manifests/monitoring/graphite.pp config file]). You will see alerts in the wikimedia-analytics IRC channel.
*[IN PROGRESS] Replication lag for MySQL slaves ([[phab:T124306|https://phabricator.wikimedia.org/T124306]])
====Consumption Lag "Alarms": Burrow====
Burrow alarms report numbers like
eventlogging-valid-mixed:2 (1454622892611, 199315032, 17) -> (1454622901672, 199315032, 17)
This is
(timestamp, offset, #of messages behind (lag))
*By default, Burrow evaluates lag over a window of 10 offsets. We commit offsets every second, so this would evaluate lag every 10 seconds, which seems too frequent; we have therefore changed the length of the lag window to 100 secs.
*Lag is evaluated approximately every couple of minutes (we are fine-tuning this).
*docs about burrow lag - https://github.com/linkedin/Burrow/wiki/Consumer-Lag-Evaluation-Rules
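Each partition line in a Burrow alarm follows the <code>Topic:Partition (timestamp, offset, lag) -> (timestamp, offset, lag)</code> layout described above. As a sketch for reading such lines programmatically (the names <code>parse_partition_line</code>, <code>Sample</code> and <code>PartitionStatus</code> are made up for illustration and are not part of Burrow), a regular expression is enough:

<syntaxhighlight lang="python">
import re
from collections import namedtuple

Sample = namedtuple("Sample", ["timestamp", "offset", "lag"])
PartitionStatus = namedtuple("PartitionStatus", ["topic", "partition", "start", "end"])

# Topic:Partition (timestamp, offset, lag) -> (timestamp, offset, lag)
LINE_PATTERN = re.compile(
    r"^(\S+):(\d+)\s+"
    r"\((\d+),\s*(\d+),\s*(\d+)\)\s*->\s*"
    r"\((\d+),\s*(\d+),\s*(\d+)\)$"
)


def parse_partition_line(line):
    """Parse one partition line of a Burrow alarm into a PartitionStatus."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError("not a Burrow partition line: %r" % (line,))
    topic, partition = match.group(1), int(match.group(2))
    numbers = [int(value) for value in match.groups()[2:]]
    return PartitionStatus(topic, partition, Sample(*numbers[:3]), Sample(*numbers[3:]))


status = parse_partition_line(
    "eventlogging-valid-mixed:2 (1454622892611, 199315032, 17) -> (1454622901672, 199315032, 17)"
)
print(status.topic, status.partition, status.end.offset - status.start.offset, status.end.lag)
# prints: eventlogging-valid-mixed 2 0 17
</syntaxhighlight>

In this sample the committed offset did not advance between the two samples while the lag stayed at 17 messages.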
An interesting alert use case is the following one:<pre>
Cluster: eqiad
Group: eventlogging-00
Status: ERR
Complete: true
Errors: 1 partitions have problems
Format: Topic:Partition (timestamp, start-offset, start-lag) -> (timestamp, end-offset, end-lag)
eventlogging-client-side:11 (1455812030875, 188196575, 0) -> (1455813022220, 188205237, 0)
</pre>As you can see, the end offset is greater than the start offset and the lag is zero, yet the partition status is ERROR. This is due to Burrow's current rule 4 in https://github.com/linkedin/Burrow/wiki/Consumer-Lag-Evaluation-Rules, which should be read as follows:
''Burrow will alert you when a consumer is so slow that the time elapsed between the last offset committed and now is bigger than the time taken to commit all the offsets belonging the last window.''
This can happen when a Kafka broker goes offline and it is a partition leader. EventLogging will need a bit of time to recognize the problem and request new metadata from Kafka, and hence the related consumer status according to Burrow will look like it is stalled or completely blocked even though it is only a temporary stop.
=====Resetting burrow consumer group monitoring=====
If you happen to change the topic assignment for a consumer group, burrow will continue to think that that group should consume from its previous assigned topics, and report lag if it stops doing so. To reset what topics burrow should monitor for a given consumer group, you should delete the consumer group monitoring from burrow. After deletion, consumer group monitoring will be automatically recreated for new topic partitions that get offset commits.
curl -X DELETE localhost:8000/v2/kafka/<cluster>/consumer/<consumer_group>
e.g.
curl -X DELETE localhost:8000/v2/kafka/eqiad/consumer/eventlogging_consumer_mysql_00
See: https://github.com/linkedin/Burrow/wiki/http-request-remove-consumer-group
==Dumping data via sqoop from eventlogging to hdfs==
We will archive large tables that do not need immediate data access at <code>/wmf/data/archive/eventlogging/Table_name</code>. The archival is a plain sqoop dump of the table, nothing else. Avro schemas for the tables (autogenerated by sqoop upon import) can be found at /wmf/data/archive/eventlogging/avro-schemas.
Hive tables on top of the data are created in the archive database.
===Sqoop===
Of interest: https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_example_invocations
Check that you can connect via sqoop using a command with no side effects, for example listing tables:
sudo -u hdfs sqoop list-tables --password-file '/user/hdfs/mysql-analytics-research-client-pw.txt' --username research --driver org.mariadb.jdbc.Driver --connect jdbc:mysql://analytics-store.eqiad.wmnet/log
If you cannot sudo as hdfs this should work:
sqoop list-tables --password-file '/user/nuria/mysql-analytics-research-client-pw.txt' --username research --driver org.mariadb.jdbc.Driver --connect jdbc:mysql://dbstore1004.eqiad.wmnet:3313/etwiki
A sample import (to /tmp/PageContentSaveCompleteAvro) in Avro file format:
time sudo -u hdfs sqoop import --as-avrodatafile --password-file '/user/hdfs/mysql-analytics-research-client-pw.txt' --username research --driver org.mariadb.jdbc.Driver --connect jdbc:mysql://analytics-store.eqiad.wmnet/log --table PageContentSaveComplete_5588433 --columns id,uuid,timestamp,webHost,wiki,event_isAPI,event_isMobile,event_revisionId --target-dir /tmp/PageContentSaveCompleteAvro
Another example (note column names with "." need to be quoted):
 time sudo -u hdfs sqoop import --as-avrodatafile --username research --password-file '/user/hdfs/mysql-analytics-research-client-pw.txt' --driver org.mariadb.jdbc.Driver --connect jdbc:mysql://analytics-store.eqiad.wmnet/log --query 'select convert(uuid using utf8) uuid, convert(timestamp using utf8) timestamp, convert(wiki using utf8) wiki, convert(webHost using utf8) webHost, convert(event_action using utf8) event_action, convert(`event_action.abort.mechanism` using utf8) `event_action.abort.mechanism`, convert(`event_action.abort.timing` using utf8) `event_action.abort.timing`, convert(`event_action.abort.type` using utf8) `event_action.abort.type`, convert(`event_action.init.mechanism` using utf8) `event_action.init.mechanism`, convert(`event_action.init.timing` using utf8) `event_action.init.timing`, convert(`event_action.init.type` using utf8) `event_action.init.type`, convert(`event_action.ready.timing` using utf8) `event_action.ready.timing`, convert(`event_action.saveAttempt.timing` using utf8) `event_action.saveAttempt.timing`, convert(`event_action.saveFailure.message` using utf8) `event_action.saveFailure.message`, convert(`event_action.saveFailure.timing` using utf8) `event_action.saveFailure.timing`, convert(`event_action.saveFailure.type` using utf8) `event_action.saveFailure.type`, convert(`event_action.saveIntent.timing` using utf8) `event_action.saveIntent.timing`, convert(`event_action.saveSuccess.timing` using utf8) `event_action.saveSuccess.timing`, convert(`event_editingSessionId` using utf8) `event_editingSessionId`, convert(event_editor using utf8) event_editor, convert(event_integration using utf8) event_integration, convert(`event_mediawiki.version` using utf8) `event_mediawiki.version`, convert(`event_page.id` using utf8) `event_page.id`, convert(`event_page.ns` using utf8) `event_page.ns`, convert(`event_page.revid` using utf8) `event_page.revid`, convert(`event_page.title` using utf8) `event_page.title`, convert(event_platform using utf8) event_platform, convert(`event_user.class` using utf8) `event_user.class`, convert(`event_user.editCount` using utf8) `event_user.editCount`, convert(`event_user.id` using utf8) `event_user.id`, convert(event_version using utf8) event_version from Edit_13457736_15423246 where $CONDITIONS' --target-dir /wmf/data/archive/eventlogging/Edit_13457736_15423246 --split-by uuid
Remapping columns and using a custom query:
time sudo -u hdfs sqoop import --as-avrodatafile --password-file '/user/hdfs/mysql-analytics-research-client-pw.txt' --username research --driver org.mariadb.jdbc.Driver --connect jdbc:mysql://analytics-store.eqiad.wmnet/log --query 'select id,uuid,convert(timestamp using utf8) timestamp,convert(webHost using utf8) webhost,wiki,cast(event_isAPI as Integer) event_isAPI,cast(event_isMobile as Integer) event_isMobile,cast(event_revisionId as Integer) event_revisionId from PageContentSaveComplete_5588433 where $CONDITIONS' --map-column-java event_isAPI=Integer,event_isMobile=Integer,event_revisionId=Integer --target-dir /tmp/PageContentSaveCompleteAvro --split-by id
An example of how to set up a table on top of avro files can be found here: https://github.com/wikimedia/analytics-refinery/blob/master/hive/mediawiki/history/create_mediawiki_page_table.hql
I (Nuria) could not get the direct mapping on hive to work, rather I had to use a syntax that included the schema to create the table: https://gist.github.com/nuria/acd67dc1d237c59a2dda9799e82da4c3#file-create-avro-table-with-schema-hql
See also: https://phabricator.wikimedia.org/diffusion/AWCM/browse/master/WDCM_Sqoop_Clients.R
==EventLogging Routine Maintenance for the oncall==
*Check [https://grafana.wikimedia.org/dashboard/db/EventLogging grafana] for problems with raw vs. validated events, or other apparent problems
*Check errors on logstash: https://logstash.wikimedia.org
*Check storage for any gaps if you think there might be an issue: A few different scripts exist, [https://gist.github.com/milimetric/b3f3d34d8d6a77f28463 Milimetric's gist] for example.
*Decide whether we need to deploy that week; avoid Friday deployments
*Remember to log all actions to the SAL log (!log <something> on the ops channel)
*Report outages as part of Wikimedia's incident reports so there is a reference
*Follow up on any alarms that might be raised
==How Tos==
===Restart all EventLogging processes===
Check:
sudo eventloggingctl status
Run:
sudo eventloggingctl restart
Stop completely:
sudo eventloggingctl stop
The config applied to create logs and such is at:
/etc/eventlogging.d/*/*
===Start/Stop/Restart individual EventLogging processes===
EventLogging processes are managed by systemd. Each config file in <tt>/etc/eventlogging.d/*/*</tt> corresponds to a single eventlogging process. Let's call the pieces of this hierarchy <tt>/etc/eventlogging.d/$service/$name</tt>.
To stop one of them, you can do something like:
systemctl stop eventlogging-$service@$name
For example:<syntaxhighlight lang="bash">
elukey@eventlog1002:~$ sudo systemctl status eventlogging-consumer@mysql-m4-master-00.service
● eventlogging-consumer@mysql-m4-master-00.service - Eventlogging Consumer mysql-m4-master-00
Loaded: loaded (/lib/systemd/system/eventlogging-consumer@mysql-m4-master-00.service; enabled; vendor preset: enabled)
Active: active (running) since Thu 2018-03-08 13:32:20 UTC; 5 days ago
Main PID: 7882 (python)
Tasks: 2 (limit: 4915)
CGroup: /system.slice/system-eventlogging\x2dconsumer.slice/eventlogging-consumer@mysql-m4-master-00.service
└─7882 python /srv/deployment/eventlogging/analytics/bin/eventlogging-consumer @/etc/eventlogging.d/consumers/mysql-m4-master-00
elukey@eventlog1002:~$ sudo systemctl restart eventlogging-consumer@mysql-m4-master-00.service
</syntaxhighlight>You can stop, start, and restart any individual EventLogging process using variations of this command.
===Backfilling===
[[Analytics/EventLogging/Backfilling]]
===Check logs and errors===
Raw logs are at:
/srv/log/eventlogging
Process logs are at:
/srv/log/eventlogging/systemd
Errors appear in the logs, but it is often easier to check them by consuming from the Kafka error topic:
> kafkacat -C -b kafka-jumbo1001.eqiad.wmnet -t eventlogging_EventError
Using the above with some sed/sort/uniq you can easily get the distribution of errors per schema (change the <code>-o -10000</code> to the number of most recent Kafka messages you want to analyze):
> kafkacat -C -b kafka-jumbo1001.eqiad.wmnet -t eventlogging_EventError -o -10000 -e | sed -n 's/^.*"schema": "\([^"]*\)"}.*$/\1/p' | sort | uniq -c
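The sed/sort/uniq pipeline above can also be sketched in Python, for example when the kafkacat output has been saved to a file. This is only an illustration: the function name <code>count_errors_by_schema</code> and the sample lines are made up, and the regular expression simply mirrors the sed pattern.

<syntaxhighlight lang="python">
import re
from collections import Counter

# Same pattern as the sed expression: the leading greedy ".*" makes it pick
# the last '"schema": "..."}' occurrence on the line.
SCHEMA_PATTERN = re.compile(r'^.*"schema": "([^"]*)"}')


def count_errors_by_schema(lines):
    """Count lines per schema name; lines without a match are ignored."""
    counts = Counter()
    for line in lines:
        match = SCHEMA_PATTERN.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical, simplified error events; real EventError events carry more fields.
sample_lines = [
    '{"event": {"message": "bad field", "schema": "Edit"}, "wiki": "enwiki"}',
    '{"event": {"message": "bad type", "schema": "NavigationTiming"}, "wiki": "dewiki"}',
    '{"event": {"message": "bad field", "schema": "Edit"}, "wiki": "frwiki"}',
]
for schema, count in count_errors_by_schema(sample_lines).most_common():
    print(count, schema)
</syntaxhighlight>

Sorting by count with <code>most_common()</code> puts the noisiest schema first, which is usually the one to look at.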
===Troubleshoot events coming in in real time===
*Incoming event counts are logged to Graphite; the counts of both validating and non-validating events per schema are available. Using those, users can get a sense of change: Graphite is populated in real time, and if events for a schema suddenly stop validating, it is clearly visible.
*The EventLogging slave database (for users with access to the 'research' user) is also populated in real time.
Lastly, EventLogging events coming in in real time are written to files that are synced to stat1007 and stat1006 once a day. These files can be found here:
~@stat1006:/srv/log/eventlogging/archive$
If you detect an issue or a suspicious change, please notify analytics@ and escalate with analytics devs.
===Troubleshoot insufficient permission===
"error: insufficient permission for adding an object to repository database .git/objects"
Run <code>groups</code> to see whether you are in the wikidev group. If so, some files in the .git directory are likely not writable by the "wikidev" group. Make them so.
===Deploy EventLogging===
EventLogging is deployed using [https://doc.wikimedia.org/mw-tools-scap/scap3/index.html scap3]. The scap deployment configuration for various EventLogging deployments can be found in specific scap repos in gerrit: <tt>eventlogging/scap/<deployment-name></tt>. The EventLogging Analytics deployment scap configs are at [https://github.com/wikimedia/eventlogging-scap-analytics eventlogging/scap/analytics].
Deploy from [[deployment.eqiad.wmnet]] using:
<syntaxhighlight lang="bash">
# ssh to production deploy server
ssh deployment.eqiad.wmnet
# cd to the EventLogging Analytics instance deploy source
cd /srv/deployment/eventlogging/analytics
# Checkout the revision you want to deploy
git pull
# Update the submodules
git submodule update --init
# Run scap3 deployment
deploy
</syntaxhighlight>
ssh eventlog1002.eqiad.wmnet (or wherever eventlogging is deployed.)
Go to /srv/deployment/eventlogging/analytics
See that checkout is there from what you just pulled in from tin (via git log).
Restart EL on target host (eventlog1002.eqiad.wmnet)
eventloggingctl stop
eventloggingctl start
Check various logs in /srv/log/eventlogging/systemd/eventlogging_* to see that things are running as they should.
Check that /srv/log/eventlogging/all-events.log has data flowing in.
Hop in the Ops IRC channel and !log that you upgraded & restarted EventLogging and add the commit hash that you deployed.
!log Redeployed eventlogging with revert to batch/queue size change - https://gerrit.wikimedia.org/r/#/c/258384/
Now please deploy latest code to Beta Cluster to keep things in sync: [[EventLogging/Testing/BetaLabs#How_to_deploy_code]]
===Blacklist a schema===
Push a change to Puppet like this:
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/516644/
===Resolve lagging replication between MySQL Master and Slaves===
If replication is lagging, we should open a ticket with the info we have and tag it with the Phabricator DBA tag so the DBAs get pinged. An example of a prior problem: [https://phabricator.wikimedia.org/T123634#1934482]
Keep in mind that while database monitoring is here: [https://tendril.wikimedia.org/host/view/dbstore1002.eqiad.wmnet/3306], the lag reported there does not apply, as replication on EL doesn't go through the regular channels.
The ad hoc replication script is here:
https://github.com/wikimedia/operations-software/blob/master/dbtools/eventlogging_sync.sh
===Raise log verbosity to debug===
Sometimes while investigating an outage it is handy to raise the log verbosity to DEBUG to get more information about what is happening. There are three main kinds of logs:
*Eventlogging ones (i.e. the ones logged from EL itself)
*Kafka Python ones (library used to consume events from Kafka)
*Confluent Kafka Python ones (library used to produce events to Kafka via librdkafka)
For the first two it is sufficient to do the following (requires root permissions):
*Identify the Eventlogging daemon that you want to debug (processor 01 for example) and get its systemd unit path via <code>systemctl cat eventlogging-processor@client-side-01.service</code> (it will be the first line reported by the command).
*Open the file with an editor and add a line with Environment=LOG_LEVEL=DEBUG
*Restart the daemon via <code>systemctl restart eventlogging-processor@client-side-01.service</code>
For Confluent Kafka Python you'll need to add a specific parameter to the Eventlogging configuration files, since the library needs to configure librdkafka accordingly (where all the useful logs will come from). The procedure is the following:
*Identify the Eventlogging daemon that you want to debug (processor 01 for example)
*Open its configuration file with an editor (in this case /etc/eventlogging.d/processors/client-side-01)
*Identify lines starting with <code>kafka-confluent:///</code> and append to the end of the line debug=something (with something picked from [https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md librdkafka's configuration guidelines] - currently suggested ones are: detailed Producer debugging: broker,topic,msg. Consumer: consumer,cgrp,topic,fetch)
*Restart the daemon via <code>systemctl restart eventlogging-processor@client-side-01.service</code>
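The Confluent Kafka edit described above can be sketched in Python. This is only an illustration: the helper name <code>add_librdkafka_debug</code> is hypothetical, and it assumes that <code>debug=...</code> is appended as one more query-string parameter of the <code>kafka-confluent:///</code> URI. Check the actual configuration file before relying on that.

```python
def add_librdkafka_debug(line, contexts):
    """Return a kafka-confluent URI line with debug=<contexts> appended.

    Assumption: the debug setting travels as a URI query parameter, so it
    is joined with '&' when the line already has a '?', else with '?'.
    """
    if not line.startswith("kafka-confluent:///"):
        return line  # not a Confluent Kafka URI, leave it untouched
    separator = "&" if "?" in line else "?"
    return "{}{}debug={}".format(line.rstrip(), separator, ",".join(contexts))


# Hypothetical producer line; the debug contexts are the ones suggested above.
example = "kafka-confluent:///broker.example:9092?topic=example-topic"
print(add_librdkafka_debug(example, ["broker", "topic", "msg"]))
```

The daemon still has to be restarted afterwards for the new setting to take effect.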
[[Category:Data platform]]
[[Category:Data platform systems]]
j1r0289yk0qx5elhf4xx84w127w7axu
Obsolete:Analytics/Archive/EventLogging/Performance
110
25676
2421377
2259954
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Performance]] to [[Obsolete:Analytics/Archive/EventLogging/Performance]]: pages are obsolete
2201888
wikitext
text/x-wiki
Put any performance related Event Logging information on this page.
== Event Logging load test ==
Setup:
* using analytics1004.eqiad.wmnet, a beefy if not trusty Cisco box (analytics1004.eqiad.wmnet as of this writing)
* 31492 events, 4 schemas, split 3/3/3/1, generated by the wonderful test_load.py
* format: seqId[space]?[qson event]
Processor
* 4 runs average: 17.6478 seconds, so ~'''1784''' events per second
* command: <code>time cat /home/milimetric/load.test.seq.30k | python ./eventlogging-processor "%{seqId}d %q" stdin:// file:///home/milimetric/out.load.30k</code>
Processor (take 2)
* using a patch from Andrew but that shouldn't affect performance for file reading (https://gerrit.wikimedia.org/r/#/c/221664/)
* consuming from a file generated using kafkacat now, with the standard format expected in prod
* 30000 events exactly, as sampled from the prod stream, so some of them are invalid (truncated, bad data, etc)
* 4 runs average: 21.3743, so ~'''1403''' events per second
* command: <code>time cat /home/milimetric/load.test.sample | python bin/eventlogging-processor "%q %{recvFrom}s %{seqId}d %t %h %{userAgent}i" stdin:// file:///home/milimetric/out.load.sample</code>
* It seems even small variations in format and number of invalid events affects performance quite a bit, so we should have a fairly large margin because we could get a spike of invalid events
Processor (take 3 - direct from kafka)
* reading from the beginning of a kafka topic so we're not limited by amount of events
* command: <code>time python bin/eventlogging-processor "%q %{recvFrom}s %{seqId}d %t %h %{userAgent}i" "kafka:///analytics1012.eqiad.wmnet:9092?topic=eventlogging-client-side&auto_offset_reset=smallest" file:///home/milimetric/out.load.from.kafka</code>
* 40.229 seconds: 40467 events
* 78.812 seconds: 78185 events
* 54.090 seconds: 65253 events
* Average: ~'''1062''' events per second with a fairly high variance again suggesting we should give ourselves a large margin
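For reference, the average above matches a pooled rate (total events divided by total seconds) rather than the mean of the three per-run rates. A small sketch of that arithmetic, using only the three measurements listed:

```python
# (seconds, events) for the three runs reading directly from Kafka
RUNS = [(40.229, 40467), (78.812, 78185), (54.090, 65253)]


def pooled_rate(runs):
    """Events per second over all runs combined."""
    total_seconds = sum(seconds for seconds, _ in runs)
    total_events = sum(events for _, events in runs)
    return total_events / total_seconds


def per_run_rates(runs):
    """Events per second for each run, to show the spread."""
    return [events / seconds for seconds, events in runs]


print(round(pooled_rate(RUNS)))                   # pooled events per second
print([round(r) for r in per_run_rates(RUNS)])    # per-run events per second
```

The per-run rates differ by roughly 20%, which is the variance the note above refers to.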
Processor (take 4 - direct from kafka but with custom code to make all events invalid)
* Idea: does the processing rate suffer if all events are invalid?
* 18.938 seconds, 13348 events, '''704''' per second
Processor (take 5 - all events invalid, no logging to slow down the console output)
* Idea: does the logger output which goes to the console in these tests slow down the processing?
* 28.326 seconds, 76995 events, '''2719''' per second!!!
Processor (take 6)
* using Andrew's fancy parallel processor
* 1 process, 100k events: 71.439 seconds, 68.734
* 2 processes, 100k events: 35.335 seconds, 35.794
* 5 processes, 100k events: 15.269 seconds, 15.124
* 7 processes, 100k events: 12.959 seconds, 12.772
* 10 processes, 100k events: 11.291 seconds, 11.xxx
* 12 processes, 100k events: 11.000 seconds, 10.831
* 24 processes, 100k events: 9.713 seconds, 9.843
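The scaling in take 6 can be summarised as speedup and parallel efficiency relative to the single-process run. The sketch below uses the first timing on each row and assumes the second figure is a repeat run, which the notes above do not state explicitly.

```python
# process count -> seconds for 100k events (first timing on each row)
TIMINGS = {1: 71.439, 2: 35.335, 5: 15.269, 7: 12.959,
           10: 11.291, 12: 11.000, 24: 9.713}


def speedup(timings):
    """Speedup of each configuration relative to one process."""
    baseline = timings[1]
    return {n: baseline / seconds for n, seconds in timings.items()}


def efficiency(timings):
    """Speedup divided by process count (1.0 would be perfect scaling)."""
    return {n: s / n for n, s in speedup(timings).items()}


for n, s in sorted(speedup(TIMINGS).items()):
    print("{:>2} processes: speedup {:.2f}, efficiency {:.2f}".format(
        n, s, efficiency(TIMINGS)[n]))
```

Scaling is close to linear up to about 5 processes and flattens after that, so adding processes beyond roughly 10 gains little on this host.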
Consumer
* NOTE: mysql server was installed on the same host as the eventlogging-consumer
* NOTE: tried this with four different batch sizes (400, 1000, 4000, 8000) and it didn't seem to make a substantive difference. Therefore I just used the default configured batch size of 400. This indicates that most likely the performance bottleneck is in preparing and organizing the events for insertion and not the insertion itself.
* 4 runs average: 11.8945 seconds, so ~'''2647''' events per second
* command: <code>time cat ~/out.load.30k | python eventlogging-consumer stdin:// mysql://root:root@127.0.0.1/log?charset=utf8</code>
== Benchmarking DB inserts ==
When batching is turned on (i.e. events for distinct schemas are being grouped), these are some numbers we get on vanadium:
Inserting 300 events takes about 1 sec
Inserting 400 events takes less than 2 secs (~1.6)
Inserting 500 events takes about 2 secs
Inserting 600 events takes over 2 secs
Inserting 100/1500 events takes 3.5 secs
[[Category:Data platform]]
[[Category:Data platform systems]]
eai7r9cvxok969bjhj6fd2kylh8z7oi
Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation
110
26147
2421381
2259958
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Sanitization vs Aggregation]] to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]: pages are obsolete
2201890
wikitext
text/x-wiki
This page explains two strategies for sanitizing/aggregating EventLogging (EL) schema tables, so that they comply with the [[m:Data_retention_guidelines|Data Retention Guidelines]]. Specifically it describes the sanitization/aggregation of the field editCount, which is a numerical field that many EL schemas share. Note that the two options are not mutually exclusive, and also that this example could be extended to other fields that share the same properties as editCount.
=== 1) Sanitization of editCount field ===
Consider the following EventLogging table:
{| class="wikitable"
!editCount
!field1
!...
!fieldN
|-
|0
|a
|...
|10
|-
|4
|b
|...
|20
|-
|78
|c
|...
|30
|-
|435
|d
|...
|40
|-
|17840
|e
|...
|50
|}
The idea is to add a new field named editCountBucket (or similar). Its value would be a bucketification of the editCount field, an enum like: "0 edits", "1-4 edits", "5-99 edits", "100-999 edits", "1000+ edits".
{| class="wikitable"
!editCount
!field1
!...
!fieldN
!editCountBucket
|-
|0
|a
|...
|10
|0 edits
|-
|4
|b
|...
|20
|1-4 edits
|-
|78
|c
|...
|30
|5-99 edits
|-
|435
|d
|...
|40
|100-999 edits
|-
|17840
|e
|...
|50
|1000+ edits
|}
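A minimal sketch of the bucketing, using the bucket labels from the example above. The function name <code>edit_count_bucket</code> is hypothetical; the real instrumentation would compute the value wherever the event is produced.

```python
from bisect import bisect_right

# Lower bound of each bucket, in ascending order, with the matching label.
_LOWER_BOUNDS = [0, 1, 5, 100, 1000]
_LABELS = ["0 edits", "1-4 edits", "5-99 edits", "100-999 edits", "1000+ edits"]


def edit_count_bucket(edit_count):
    """Map a raw editCount to its editCountBucket label."""
    if edit_count < 0:
        raise ValueError("editCount cannot be negative")
    # bisect_right returns how many lower bounds are <= edit_count;
    # subtracting one gives the index of the bucket the value falls in.
    return _LABELS[bisect_right(_LOWER_BOUNDS, edit_count) - 1]


for value in (0, 4, 78, 435, 17840):
    print(value, "->", edit_count_bucket(value))
```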
After 90 days, only the editCount would be deleted, leaving the table with the other original fields plus editCountBucket:
{| class="wikitable"
!editCount
!field1
!...
!fieldN
!editCountBucket
|-
|
|a
|...
|10
|0 edits
|-
|
|b
|...
|20
|1-4 edits
|-
|
|c
|...
|30
|5-99 edits
|-
|
|d
|...
|40
|100-999 edits
|-
|
|e
|...
|50
|1000+ edits
|}
This keeps the data set non-aggregated (all the non-sensitive data stays as is) while still permitting queries on a safe simplification of the editCount field.
==== Work needed ====
Here's a list of tasks to implement this solution:
# '''[Analytics team]''' Create a new version of the schema with the field editCountBucket.
# '''[Product team]''' Modify the instrumentation to use the new schema and populate the editCountBucket field.
# '''[Analytics team]''' Create a SQL script to update all the tables for the old revisions of the schema, adding the new field.
# '''[DBA]''' Run/schedule the update script and activate auto-purging of the editCount field after 90 days.
==== Issues related to mobile deployment flow ====
If the schema is populated from a single-version system, there would be no problems. But if the schema is populated from different versions of a mobile app, there will always be events coming to old revision tables, without the editCountBucket information. So the update of the old schema tables (step 3) should be executed periodically, at least once every 90 days, to ensure that no editCount fields get deleted without their respective editCountBucket fields having a value.
=== 2) Aggregation upon editCount field ===
The idea here is to use EventLogging report schedulers (generate.py, reportupdater) to store the desired custom SQL metrics in report files daily. Provided they do not persist sensitive information, the reports can be kept indefinitely. Note that the query still reads the entire non-sanitized data, because it runs over the last 90 days of events, and that after that period the sensitive data in the tables should be purged. For example, given the same table as in the former example:
{| class="wikitable"
!editCount
!field1
!...
!fieldN
|-
|0
|a
|...
|10
|-
|4
|b
|...
|20
|-
|78
|c
|...
|30
|-
|435
|d
|...
|40
|-
|17840
|e
|...
|50
|}
You could write a SQL query like:<syntaxhighlight lang="sql">
SELECT
DATE(timestamp) as day,
SUM( IF( event_editCount = 0, 1, 0 ) ) AS "0 edits",
SUM( IF( event_editCount > 0 AND event_editCount < 5, 1, 0 ) ) AS "1-4 edits",
SUM( IF( event_editCount >= 5 AND event_editCount < 100, 1, 0 ) ) AS "5-99 edits",
SUM( IF( event_editCount >= 100 AND event_editCount < 1000, 1, 0 ) ) AS "100-999 edits",
SUM( IF( event_editCount >= 1000, 1, 0 ) ) AS "1000+ edits"
FROM <schema_table>
GROUP BY day
ORDER BY day
;
</syntaxhighlight>And schedule it via EventLogging schedulers. They would create a CSV/TSV report file that would look like this (the numbers are made up):<syntaxhighlight lang="text">
day, 0 edits, 1-4 edits, 5-99 edits, 100-999 edits, 1000+ edits
2015-01-01, 2846, 325, 27, 3, 1
2015-01-02, 2476, 292, 25, 4, 1
2015-01-03, 3012, 321, 19, 3, 2
...
</syntaxhighlight>This kind of report has the advantage that it can be displayed very easily in a dashboard using Dashiki. The disadvantage is that it cannot be queried via SQL.
==== Work needed ====
Here's a list of tasks to implement this solution:
# '''[Product team]''' Write the SQL queries in the context of EL schedulers.
# '''[Analytics team]''' If the product team doesn't have an instance of the schedulers running yet, create a new repository for it and puppetize its execution.
[[Category:Data platform]]
[[Category:Data platform systems]]
2byj6dmrjv1xcoznvbtve0hu1x78qbz
Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation
111
26215
2421399
2259978
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]] to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]: pages are obsolete
1755644
wikitext
text/x-wiki
To be honest I'm pretty agnostic to the proposed solutions aside from the amount of work involved to implement. Having said that, it seems the 2nd option is a little less intensive, so I lean that way. <blockquote>'''Update from Analytics:''' You're right, option 2 has less to-dos, but that's mainly for the Analytics team, the quantity of work for the Mobile team would be similar in both cases, and our preference would be option 1 for several reasons:</blockquote><blockquote>'''In option 1''', Mobile team needs to modify the instrumentation, '''but this doesn't need to happen right away''', Analytics can carry on with the EL audit without the new instrumentation. We just want to make sure that new schemas created in the future possess the new field. On the other hand option 1 is less aggressive with the data, bucketizing the editCount field instead of deleting it.</blockquote><blockquote>'''In option 2''', Mobile team needs to determine and implement a set of metrics they want to persist in the reports '''before the auto-purging starts'''. Otherwise the new metrics won't be able to have historic data on editCount field. So Analytics EL audit would be blocked on this task for Mobile team.</blockquote>I saw the comment about how it wouldn't be query-able via SQL… but there isn't anything stopping us from storing the results in a database rather than a log file… which may or may not be helpful in this case. Either way, I'd like to see people more affected by this than me chime in as I think their preferences are more important.<blockquote>'''Update from Analytics:''' You're right here as well, there's no technological reason stopping us to store that data in another DB. We'd just prefer not to have various DBs storing EventLogging potentially sensitive data. We'd like to have a single source of that data for easier control.
We consider the EL report pipeline as part of EL system and we'd prefer to use it to create the report files, vs a new replication-like feature.</blockquote>[Above from Corey Floyd]
== Does this affect the Limn graphs / Mobile report card? ==
Just wondering if this would affect http://mobile-reportcard.wmflabs.org/#apps-graphs-tab. There we query for much older data than just 90 days.
My guess is no because those SQL queries don't seem to be using the editCount field. Instead, those reports calculate number of events using the COALESCE and SUM functions. Just checking. [[User:BearND|BearND]] ([[User talk:BearND|talk]]) 20:25, 29 July 2015 (UTC)<blockquote>'''Update from Analytics:''' Exactly, there are 2 ways of querying EL database for reports: 1) incrementally day by day, which is not a problem because it will use only recent data; and 2) globally, in this case the query can not use auto-purged fields. But in short, the '''mobile-reportcard will continue working normally'''. In fact, we Analytics are working on it right now to unbreak several reports that were stuck.</blockquote>
== I like Option 2 better ==
Seems to be much less work, seems like no data will be deleted, and we can dump the TSV reports to a database or use something like [http://harelba.github.io/q/ this] to query it if we need so. [[User:Jhernandez|Jhernandez]] ([[User talk:Jhernandez|talk]]) 12:42, 30 July 2015 (UTC)<blockquote>'''Update from Analytics:''' In fact, '''with option 2 more data will be deleted''': option 1 means delete userId and bucketize editCount, option 2 means deleting both userId and editCount. Sorry if this was not clear. Please read also the other comments on both options in the first question of the page.</blockquote>
::Then I guess Option 1 is the simplest (only one place to go for data) instead of having separate stored reports for the edit buckets. Whatever you guys think is better. [[User:Jhernandez|Jhernandez]] ([[User talk:Jhernandez|talk]]) 10:51, 31 July 2015 (UTC)
== Looks like Option 1 turns out to be easier ==
Based on the discussion here and further talking with Joseph and Kevin, it looks like Option 1 is actually less work for Reading and also keeps the audit and necessary purging on track, all the while without breaking Limn graphs. --[[User:Dr0ptp4kt|Dr0ptp4kt]] ([[User talk:Dr0ptp4kt|talk]]) 23:29, 4 August 2015 (UTC)
:Quick comment: I don't know what's better, but option 1 is easier to understand. [[User:Nemo_bis|Nemo]] 21:36, 31 January 2016 (UTC)
j6byv6exhilgtqzt96pe9qew0snpxj0
Obsolete:Analytics/Archive/EventLogging/Sensitive Fields
110
26420
2421385
2259962
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Sensitive Fields]] to [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]: pages are obsolete
2201892
wikitext
text/x-wiki
=== Scenario 1: Entering personal information by mistake ===
A user enters sensitive personal information (credit card number, or others) into a text field like username, pageTitle, imageTitle, summary, etc. by mistake. After a while they realize that and immediately ask WMF to delete this information. A risk exists, that in the meantime, after the sensitive data was submitted and before its records were deleted, an EventLogging event was stored featuring this data.
* Thus, all fields that store '''textual information potentially entered by the user''' (username, pageTitle, etc.) should be auto-purged in events older than 90 days.
=== Scenario 2: Editing as anonymous by mistake ===
An editor usually edits logged in, because they don't like their IP being stored in the revision table. However, this day they edit as anonymous by mistake. After a while they realize that and immediately ask WMF to delete this information. A risk exists, that in the meantime, after the sensitive data was submitted and before its records were deleted, an EventLogging event was stored featuring this data.
* Thus, all fields that store the '''IP address of an anonymous editor''' should be auto-purged in events older than 90 days.
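The auto-purging rule from both scenarios can be sketched as follows. This is an illustration only: the event layout (a flat dict with a <code>timestamp</code> datetime) and the field name <code>clientIp</code> are assumptions, and the actual purging would run against the database tables rather than in Python.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=90)

# Text fields named in scenario 1, plus a hypothetical IP field for scenario 2.
SENSITIVE_FIELDS = ("username", "pageTitle", "imageTitle", "summary", "clientIp")


def purge_sensitive_fields(event, now):
    """Return the event with sensitive fields blanked once it is older than 90 days."""
    if now - event["timestamp"] <= RETENTION:
        return event  # still inside the retention window, keep as is
    return {key: (None if key in SENSITIVE_FIELDS else value)
            for key, value in event.items()}


old_event = {"timestamp": datetime(2015, 1, 1), "username": "Example",
             "clientIp": "192.0.2.1", "action": "save"}
print(purge_sensitive_fields(old_event, now=datetime(2015, 6, 1)))
```

Non-sensitive fields such as <code>action</code> in this example survive the purge, so the event can still be counted.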
[[Category:Data platform]]
[[Category:Data platform systems]]
j6l1cx1mppn46sffuta9ttq5rkvne0k
Obsolete:Analytics/Archive/EventLogging/Architecture
110
26947
2421349
2259926
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Architecture]] to [[Obsolete:Analytics/Archive/EventLogging/Architecture]]: pages are obsolete
2201881
wikitext
text/x-wiki
{{Notice|This documentation is outdated. See [[Event_Platform#Event_Platform_documentation_pages|Event Platform documentation]].}}
This page explains WMF's EventLogging system topology and how its parts interact. Using the following diagram as a reference:
[[File:EventLoggingStag.jpg|frameless|EventLogging architecture]]
* [https://github.com/wikimedia/varnishkafka varnishkafka] sends client-side raw (URL encoded JSON in query string) events from Varnish to eventlogging-client-side Kafka topic.
* An eventlogging-processor consumes and processes these raw events and sends them back to Kafka as JSON strings. Once processed and validated, the events are produced to Kafka in the topics eventlogging-valid-mixed and eventlogging_<schemaName>. eventlogging-valid-mixed contains the valid events from all schemas, with the exception of blacklisted high-volume schemas. eventlogging_<schemaName> holds all events for each schema.
* eventlogging-valid-mixed is consumed by eventlogging-consumer processes and stored into MySQL and into the eventlogging log files. The eventlogging_<schemaName> topics are consumed by Camus and stored in HDFS partitioned by <schemaName>/<year>/<month>/<day>/<hour>
The EventLogging back-end is comprised of several pieces that consume and produce from/to Kafka, which makes it a single-purpose standalone stream processor. The /etc/eventlogging.d file hierarchy contains the process instance definitions, with a subfolder for each service type. A systemd task uses this file hierarchy and provisions a job for each instance definition. Instance definition files contain command-line arguments for the service program, one argument per line.
An 'eventloggingctl' shell script provides a convenient wrapper for managing EventLogging processes.
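To illustrate the instance definition format, here is a minimal sketch of turning such a file into an argument list. The helper name, the sample file contents, and the choice to skip blank lines are all assumptions made for the example; the real provisioning is done by the systemd task.

```python
def parse_instance_definition(text):
    """Split an instance definition into arguments, one per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical contents of a processor definition file.
SAMPLE_DEFINITION = """\
%q %{recvFrom}s %{seqId}d %t %h %{userAgent}i
stdin://
file:///tmp/example-output.log
"""

args = parse_instance_definition(SAMPLE_DEFINITION)
command = ["eventlogging-processor"] + args
print(command)
```

Because each line is one argument, a format string containing spaces stays a single argument and needs no shell quoting.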
[[Category:Data platform]]
[[Category:Data platform systems]]
1yo8yyk9c9iwbelu5w7ws8b4r5pr3wh
Data Engineering/Systems/EventLogging/How to
0
26948
2421460
2266533
2026-05-31T09:22:14Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421460
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration#How Tos]]
jtj92pzy0a76sqa6yjslz4k2bvfmnhj
Obsolete:Analytics/Archive/EventLogging/Data representations
110
26955
2421353
2259930
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data representations]] to [[Obsolete:Analytics/Archive/EventLogging/Data representations]]: pages are obsolete
2201883
wikitext
text/x-wiki
This page gives an overview over the various representations of EventLogging data available on the WMF production cluster, and expectations around those representations.
== Hadoop & Hive ==
EventLogging analytics data is imported from Kafka into Hadoop as raw JSON, and then 'refined' into Parquet-backed Hive tables. These tables are in the Hive <code>event</code> and <code>event_sanitized</code> databases. The refined data is stored in HDFS in the <code>hdfs:///wmf/data/event</code> directory. The sanitized data is stored under <code>hdfs:///wmf/data/event_sanitized</code>.
See: [[Analytics/Systems/EventLogging#Hadoop_.26_Hive]] for info on how to access this data.
== 'all-events' JSON log files ==
Use this data source only to debug issues around ingestion into the m2 database (data ingested only into Hadoop does not go through these files).
Entries are JSON objects.
Only validated events get written.
In case of bugs, historic data does not get fixed.
Those files are available as:
* <code>stats1004:/srv/log/eventlogging/archive/all-events.log-$DATE.gz</code>
* <code>stats1005:/srv/log/eventlogging/archive/all-events.log-$DATE.gz</code>
* <code>eventlog1002:/var/log/eventlogging/...</code>
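When debugging with these archives, a quick per-schema count is often enough to see whether events arrived. The sketch below assumes one JSON object per line and a top-level <code>schema</code> key in each event; neither is stated above, so verify against a real file first. The function name is hypothetical.

```python
import gzip
import json
from collections import Counter


def count_events_by_schema(path):
    """Count events per schema in a gzipped all-events log file."""
    counts = Counter()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            event = json.loads(line)
            counts[event.get("schema", "<unknown>")] += 1
    return counts
```

It could be called, for example, as <code>count_events_by_schema("/srv/log/eventlogging/archive/all-events.log-$DATE.gz")</code> with <code>$DATE</code> replaced by the date of the archive you need.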
== Raw 'client' side log files ==
Use this data source only to debug issues around ingestion into the m2 database.
Entries are parameters to the <code>/beacon/event</code> HTTP request. They are not decoded at all.
In case of bugs, historic data does not get fixed. Hot-fixes need not reach those files either.
Those files are available as:
* <code>stats1004:/srv/log/eventlogging/archive/client-side-events.log-$DATE.gz</code>
* <code>stats1005:/srv/log/eventlogging/archive/client-side-events.log-$DATE.gz</code>
* <code>eventlog1002:/var/log/eventlogging/...</code>
== Kafka ==
EventLogging now feeds the following topics in Kafka:
* '''eventlogging-valid-mixed''': This topic exists for ingestion into MariaDB and contains most of the live EventLogging analytics data. Some schemas are blacklisted.
* '''eventlogging_<schemaName>''': All events from the specified schema. Each schema has its own dedicated topic.
== Varnish pipeline ==
Since EventLogging data is extracted at the bits caches, and the EventLogging payload is encoded in the URL, EventLogging data is available in all log targets from the caches.
In case of bugs, historic data does not get fixed. Hot-fixes need not reach this pipeline either.
[[Category:Data platform]]
[[Category:Data platform systems]]
m469l4u5ex1jgivdwgutksalnjkv5al
Obsolete:Analytics/Archive/EventLogging/Monitoring
110
26957
2421367
2259944
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Monitoring]] to [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]: pages are obsolete
2201885
wikitext
text/x-wiki
== Grafana ==
Here are some Grafana dashboards that display our Graphite metrics:
* [https://grafana.wikimedia.org/#/dashboard/db/eventlogging EventLogging Dashboard], shows hitrates of all events by schema, status of Kafka brokers, error counts and more.
* [https://grafana.wikimedia.org/dashboard/db/eventlogging-schema EventLogging-schema Dashboard], focus on an individual schema. Useful for embedding and inter-dashboard links from graphs showing the actual EventLogging data. Example: [https://grafana.wikimedia.org/#/dashboard/db/performance-metrics performance-metrics].
== Graphite ==
Raw metrics can be browsed at https://graphite.wikimedia.org/ (under Metrics -> eventlogging).
We publish 4 types of counts to graphite:
* Overall counts
* Per schema counts
* Server side counts
* Client side counts
Within the overall counts, there are 4 submetrics:
* '''Raw counts''': Number of all events that reach the system.
* '''Valid counts''': Number of events that pass validation.
* '''insertAttempted''': Number of events that get queued up for insertion to MySQL database.
* '''inserted''': Number of events that get actually inserted into MySQL database.
[[Category:Data platform]]
[[Category:Data platform systems]]
qplwhyjoo4emlk7js6hujptp9dcyd3w
EventLogging
0
26960
2421492
2266556
2026-05-31T09:22:53Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421492
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Talk:EventLogging
1
26961
2421515
2266576
2026-05-31T09:23:20Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421515
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
EventLogging/Performance
0
26962
2421496
2266560
2026-05-31T09:22:57Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Performance]]
2421496
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Performance]]
qjrz2uod4uyospuds419rd1uxhkdq3p
EventLogging/Testing/BetaLabs
0
26963
2421497
2266561
2026-05-31T09:22:59Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421497
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Analytics/EventLogging Sanitization vs Aggregation
0
26964
2421434
2266505
2026-05-31T09:21:42Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421434
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
ozb5iyz2xjb7w6y5xu3x21aukwi414q
Talk:Analytics/EventLogging Sanitization vs Aggregation
1
26965
2421505
2266567
2026-05-31T09:23:08Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421505
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
lzpyfc1dpi1gsu0axf5047tshmgp3fh
EventLogging/Oncall
0
26967
2421494
2266558
2026-05-31T09:22:55Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421494
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
EventLogging/Operations/Outages
0
26968
2421495
2266559
2026-05-31T09:22:56Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Outages]]
2421495
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Data Engineering/Systems/EventLogging/TestingOnBetaLabs
0
27074
2421471
2266545
2026-05-31T09:22:27Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421471
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Obsolete:Analytics/Archive/EventLogging/Publishing
110
29004
2421379
2259956
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Publishing]] to [[Obsolete:Analytics/Archive/EventLogging/Publishing]]: pages are obsolete
2201889
wikitext
text/x-wiki
{{draft}}
{{note|See WMF's official data publication guidelines at https://foundation.wikimedia.org/wiki/Legal:Data_publication_guidelines}}
WMF's EventLogging database is private, because it may hold sensitive information during a certain time window. To access it, one must be an employee of the Wikimedia Foundation or have signed an NDA. Hence, any reports or data sets based on EventLogging data are potentially harmful and must be reviewed before they can be published.
== Publishing reports ==
We consider a report: any collection of prose, graphs and statistics that has been drafted by a human for the purpose of communicating some learning. It typically does NOT contain actual records of the database, nor parts of them.
Before publishing any report, you should ensure that it does not contain any potentially sensitive information, as defined below. If you are unsure whether your report contains private data, please consult with the Research or Analytics teams.
== Publishing data sets ==
We consider a data set: a collection of (whole or partial) records extracted from the database for the purpose of enabling future analyses.
The preferred option is NOT to release any such data sets publicly. If you'd like to open an exception, please contact the Legal team AND also the Community Advocacy team to review your data set, and ensure that it contains no sensitive data. If you have other questions, please ask the Analytics team or the Research team.
=== Potentially sensitive data ===
* PII (Personally identifiable information), like clientIp, userAgent, userName, userId, editCount, and in general, any piece of information that can uniquely identify a physical or virtual person.
* User-entered textual fields, like pageTitle, imageTitle, summary, userName, userText, etc. Schemas containing this kind of data are marked as such in the schema talk page.
[[Category:Data platform]]
[[Category:Data platform systems]]
exq02thu7pmu8xyildhzg06cclq7z1z
EventLogging/Backfilling
0
32589
2421493
2266557
2026-05-31T09:22:54Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
2421493
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
ixtg113mywcmt0chk7x1xzr8d83eidp
Data Engineering/Systems/EventLogging/Oncall
0
206521
2421463
2266537
2026-05-31T09:22:18Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421463
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/EventLogging
0
440621
2421419
2266490
2026-05-31T09:21:24Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421419
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Analytics/EventLogging/Administration
0
440622
2421420
2266491
2026-05-31T09:21:25Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421420
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/EventLogging/Architecture
0
440623
2421421
2266492
2026-05-31T09:21:26Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
2421421
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
563zn02ep8hvymxagfoiuv51v07t6wn
Analytics/EventLogging/Backfilling
0
440624
2421422
2266493
2026-05-31T09:21:28Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
2421422
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
ixtg113mywcmt0chk7x1xzr8d83eidp
Analytics/EventLogging/Data representations
0
440625
2421423
2266494
2026-05-31T09:21:29Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
2421423
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
5e27mterkyfcdw6nxfeaiwk2jadfz6n
Analytics/EventLogging/How to
0
440627
2421424
2266495
2026-05-31T09:21:30Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421424
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/EventLogging/Monitoring
0
440628
2421425
2266496
2026-05-31T09:21:31Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
2421425
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
car9duvgoadqdlvtghdnzp1qalu0ntt
Analytics/EventLogging/Oncall
0
440630
2421426
2266497
2026-05-31T09:21:33Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421426
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/EventLogging/Outages
0
440631
2421427
2266498
2026-05-31T09:21:34Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Outages]]
2421427
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Analytics/EventLogging/Performance
0
440632
2421428
2266499
2026-05-31T09:21:35Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Performance]]
2421428
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Performance]]
qjrz2uod4uyospuds419rd1uxhkdq3p
Analytics/EventLogging/Publishing
0
440633
2421429
2266500
2026-05-31T09:21:36Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
2421429
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
nmbj4rbsh37le6a3rekioqs0hzpkwp1
Analytics/EventLogging/Sanitization vs Aggregation
0
440634
2421430
2266501
2026-05-31T09:21:37Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421430
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
ozb5iyz2xjb7w6y5xu3x21aukwi414q
Analytics/EventLogging/Sensitive Fields
0
440635
2421431
2266502
2026-05-31T09:21:39Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
2421431
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
k9nkblptw4pz7l3cvunzjff64jjqh0w
Analytics/EventLogging/TestingOnBetaCluster
0
440636
2421432
2266503
2026-05-31T09:21:40Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421432
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Analytics/EventLogging/TestingOnBetaLabs
0
440637
2421433
2266504
2026-05-31T09:21:41Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421433
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Talk:Analytics/EventLogging
1
440638
2421503
2266565
2026-05-31T09:23:06Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421503
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Talk:Analytics/EventLogging/Sanitization vs Aggregation
1
440639
2421504
2266566
2026-05-31T09:23:07Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421504
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
lzpyfc1dpi1gsu0axf5047tshmgp3fh
Obsolete:Analytics/Archive/EventLogging/Schema Guidelines
110
441058
2421383
2259960
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Schema Guidelines]] to [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]: pages are obsolete
2201891
wikitext
text/x-wiki
{{Notice|This documentation is outdated. See [[Event_Platform/Schemas/Guidelines]].}}
The Analytics Engineering team is considering establishing some schema guidelines that would make ingestion into Data Analysis tools easier. The current idea is that we can automatically load conforming schemas into [[Analytics/Systems/Druid|Druid]] and make them available for analysis in [[Analytics/Systems/Turnilo-Pivot|Turnilo (formerly Pivot)]] or [[Analytics/Systems/Superset|Superset]], but the particular technologies used aren't too important. This is just a draft set of guidelines, started in 2017 as part of a collaboration with the Mobile Apps team to see how these would work in practice.
EventLogging Druid ingestion is manually configured. If you want some specific EventLogging data to make it into Druid, ask the analytics team for help.
'''NOTE''': these guidelines are subject to change as part of the Modern Event Platform Schema Guidelines component. The basic ideas will remain the same, but some naming conventions may change. In the future, we'd like to improve these conventions to make automated Druid ingestion possible.
See also [[Event Platform/Schemas/Guidelines|Modern Event Platform Schema Guidelines]].
= Guidelines =
While events are originally sent using JSON, they need to be persisted to a SQL datastore, so restrictions apply.
NEVER MAKE BACKWARDS INCOMPATIBLE CHANGES. This means that (almost) the ONLY change you can make to a schema is add new optional fields.
There is no such thing as 'renaming' a field if the old field name exists in old data. You can only add new optional fields. The latest version of every schema should validate with every event of that schema ever produced.
== Schema set up ==
* Event date time should always be stored in a field called <tt>dt</tt>, in ISO 8601 format in UTC +00:00 timezone.
* Other date time fields should be in ISO 8601 format in fields suffixed with '_dt', e.g. <tt>session_start_dt</tt>
* If you must use an integer Unix epoch timestamp, send it in millisecond precision in UTC +00:00 timezone and name the field with the suffix '_ts', e.g. <tt>session_start_ts</tt>
* The schema should be as flat as possible. Don't send complex objects in a single field, flatten them out and send an event as shown below. Complex events cause more work and possible confusion down the line during analysis.
* Do not remove fields when making changes to the schema in the future. Restricting schema changes to only adding fields keeps schema (and event code) backwards compatible and doesn't break queries. Otherwise queries would need to be revised every time the schema is changed.
* All arrays must specify the items type. E.g.
<syntaxhighlight lang="javascript">
"array_field": {
"type": "array",
"items": {
"type": "string"
}
}
</syntaxhighlight>
* Types should never change. This is tricky with JSON, as both decimals and integers are valid <code>number</code>s. If you want integers, please use the <code>integer</code> type. If you want decimals, use the <code>number</code> type, but you'll need to make sure that the values ALWAYS have a decimal point in them. 2.0 is a valid float number, 2 is not. You'll run into trouble if your data has both of these formats for the same field.
* If the schema has any fields that measure time elapsed, use milliseconds as the time unit.
*Union types are not supported. That means that you cannot send null for fields that are type string. If you want an optional field, make it not required, and just don't set it in your data.
* EventLogging + Hive now supports map types. They are specified in JSONSchema like:
<syntaxhighlight lang="javascript">
"map_field": {
"type": "object",
"additionalProperties": {
"type": "string" // or whatever type your values are.
}
}
</syntaxhighlight>
* If there are fields (or sets of fields) that are mutually exclusive they should be "named" distinctively. Example:
If events look like this:
<syntaxhighlight lang="javascript">
{
user: 'TheWikiEditor',
editorship: {is_active:'yes', edit_count:'1000'}
}
</syntaxhighlight>
Do not use a marker like the following one to indicate "absence" of data
<syntaxhighlight lang="javascript">
{
user: 'TheBadWikiEditor',
editorship: {status:'blocked'}
}
</syntaxhighlight>
Rather use an explicit field like:
<syntaxhighlight lang="javascript">
{
user: 'TheBadWikiEditor',
blocked: 'true' //mutually exclusive with being able to retrieve editorship status
}
</syntaxhighlight>
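The flattening and timestamp conventions in this section can be sketched as follows. This is only an illustration: <code>flattenEvent</code> is a hypothetical helper, and joining nested names with an underscore is an assumption, not something EventLogging does for you.

```javascript
// Hypothetical helper (not part of EventLogging): flattens nested objects
// into top-level fields, joining names with "_" (an assumed convention).
function flattenEvent(obj, prefix = '') {
  const flat = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      Object.assign(flat, flattenEvent(value, name));
    } else if (value !== null && value !== undefined) {
      // Optional fields are left out instead of being sent as null.
      flat[name] = value;
    }
  }
  return flat;
}

const now = new Date(Date.UTC(2015, 11, 20, 9, 10, 56));
const event = flattenEvent({
  dt: now.toISOString(),           // ISO 8601 in UTC
  session_start_ts: now.getTime(), // integer epoch timestamp in milliseconds
  user: 'TheWikiEditor',
  editorship: { is_active: true, edit_count: 1000, last_edit: null },
});

console.log(event);
```

The resulting event has only top-level fields (<code>editorship_is_active</code>, <code>editorship_edit_count</code>), and the unset <code>last_edit</code> field is omitted rather than sent as null.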
== Ingestion into Druid ==
* All fields are "dimensions" by default. Fields that should be measures in Druid should be configured as such in EventLoggingToDruid configs. (see section below explaining what dimensions are).
=Example Schema Conforming to Guideline=
Let's say we wanted to understand feature usage for a mobile app. We might have a schema that looks like this:
<syntaxhighlight lang="javascript">
{
dt: 'ISO 8601 formatted timestamp, e.g. 2015-12-20T09:10:56Z',
app_platform: 'Platform, Operating System',
app_platform_version: 'Version of the OS',
app_version: 'The version of the app',
feature_category: 'A feature category, to allow analyzing groups of features',
feature: 'The name of the feature',
time_since_last_action: 'In milliseconds, how much time passed since the last user action and until they engaged this feature',
time_spent: 'In milliseconds, how much time did the user spend using the feature'
}
</syntaxhighlight>
=Dimension=
Dimension in data modeling is a construct that categorizes data. In the Druid sense, we're usually talking about [https://en.wikipedia.org/wiki/Dimension_(data_warehouse)#Degenerate_dimension degenerate dimensions] which are basically like labels for your data. Examples are: country, project, agent type, app version, browser, etc.
=== Regarding Dimensions with "many" Numeric Values ===
Druid is not the best tool for manipulating numeric data. It excels at manipulating cubes of "dimensions", each of which has low cardinality (the number of distinct values a dimension can take), for example: "pageviews per country for Chrome across all Wikimedia projects". Numeric values that are not bucketed have effectively "infinite" cardinality and are thus not well suited for ingestion into Druid as dimensions. When numeric values are ingested as "measures", be aware that Druid supports a limited set of aggregations: http://druid.io/docs/latest/querying/aggregations
==== Bucketing time measures ====
Fields that contain time elapsed values cannot be treated as metrics (measures) in Druid. However, our EventLogging Druid ingestion pipeline can bucket time elapsed fields, transforming each field into an 'ingestible' dimension with values like "100ms-1sec", "10sec-1min", etc. So you can still consider having time elapsed fields in your schema.
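As a rough illustration of this kind of bucketing, the sketch below maps an elapsed time in milliseconds to a label. The bucket edges and the function name <code>bucketElapsedMs</code> are assumptions made for the example; the actual pipeline configuration may differ.

```javascript
// Illustrative bucket edges only; these are assumptions, not the
// configuration used by the EventLogging Druid ingestion pipeline.
const BUCKETS = [
  { upTo: 100, label: '0ms-100ms' },
  { upTo: 1000, label: '100ms-1sec' },
  { upTo: 10000, label: '1sec-10sec' },
  { upTo: 60000, label: '10sec-1min' },
];

// Returns the label of the first bucket whose upper edge is above `ms`.
function bucketElapsedMs(ms) {
  const bucket = BUCKETS.find((b) => ms < b.upTo);
  return bucket ? bucket.label : '1min+';
}

console.log(bucketElapsedMs(250));   // falls in the 100ms-1sec bucket
console.log(bucketElapsedMs(15000)); // falls in the 10sec-1min bucket
```

Because every value maps to one of a handful of labels, the resulting field has low cardinality and can be used as a dimension.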
== See also ==
*[[Analytics/Systems/Hive to Druid]]
Real life examples:
* Ingestion of the [[Analytics/Data Lake/Traffic/Pageview hourly|pageviews_hourly]] Hadoop table into Druid: [https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pageview/druid/daily/load_pageview_daily.json.template load_pageview_daily.json.template] (specifying dimensions and metrics), [https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pageview/druid/daily/generate_daily_druid_pageviews.hql generate_daily_druid_pageviews.hql] (daily Hive query exporting the Hadoop table)
*[[phab:T202751|T202751]] "Ingest data from PageIssues EventLogging schema into Druid"
[[Category:Data platform]]
[[Category:Data platform systems]]
8peyxnhhtewv18pj2asz43mx330ylxf
Obsolete:Analytics/Archive/EventLogging/EventCapsule
110
441996
2421363
2259940
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/EventCapsule]] to [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]: pages are obsolete
2201884
wikitext
text/x-wiki
This page will document EventLogging's EventCapsule, a wrapper that stores meta information about EventLogging events. The EventCapsule is stored in the EventLogging source code in [https://github.com/wikimedia/eventlogging/blob/master/eventlogging/capsule.py eventlogging/capsule.py]. Before February 2018, [https://meta.wikimedia.org/w/index.php?title=Schema:EventCapsule it was stored] with other schemas on meta wiki. As part of https://phabricator.wikimedia.org/T179836, it was moved to source code, as the codebase's operation is highly dependent on the EventCapsule structure. Changes to EventCapsule need to be coordinated with source code deployments.
https://meta.wikimedia.org/wiki/Schema:EventCapsule will be maintained for documentation purposes, but the canonical EventCapsule schema lives at [https://github.com/wikimedia/eventlogging/blob/master/eventlogging/capsule.py eventlogging/capsule.py].
[[Category:Data platform]]
[[Category:Data platform systems]]
a2xhzqymdvubv58vd0k703atcxw0t8d
Obsolete:Analytics/Archive/EventLogging/NotErrorLogging
110
442848
2421371
2259948
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/NotErrorLogging]] to [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]: pages are obsolete
2201886
wikitext
text/x-wiki
== Eventlogging is not well suited to do error logging ==
There have been many discussions in the past (and present, see: {{Phabricator|T203814}}) about using EventLogging to do client side error logging. While EventLogging is well suited to handle events, and ingesting client side traces shares similarities with ingesting events, EventLogging is really not well suited to be a client side error logging library for several reasons:
1) EventLogging is designed around handling data that validates against a schema. While error messages might be JSON, there is really no value in validating them against a schema. The error happened and it should be ingested by the backend regardless of whether it validates; it is not "curated data".
2) Any system we use to handle error logging should be tier-1, EventLogging is tier-2.
3) Any system to handle errors needs to be able to group by stack traces and be good at handling free text. EventLogging is not, for the reasons in 1): it is made to deal with data that abides by a schema. It does not group events by free text (like stack traces), so an error that appears for a million pageviews in 1 hour will appear in the database a million times rather than appearing once with a count of 1 million. EventLogging is made for distinct events, and errors are not distinct events: all users of Chrome 68 might be running into the same error for the same reasons. This is a big deal, and it is why a solution customized to the error space is needed. See 4).
4) There is no need to reinvent the wheel: [https://sentry.io/welcome/ Sentry] is well-established software that does this very thing, client side error logging. See attempts to install Sentry at the foundation: https://phabricator.wikimedia.org/tag/sentry/
While grouping server side stack traces is a core usage of Sentry, in order to deal with bursty traffic an error logging solution probably needs to do some client side normalization of stack traces and deduplication, so that errors are somewhat processed by the time they get to the server side. Sentry comes with a client side library that does part of this pre-processing.
5) Privacy concerns. This is a smaller concern than the ones listed prior, but since it has come up it is listed for completeness: a log system normally has short retention. EL data is retained for 90 days and the whitelisted data is retained for a longer term. Erroneously whitelisting error messages for longer retention can lead to privacy concerns such as these: https://phabricator.wikimedia.org/T136851
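To make the grouping argument in point 3 concrete, here is a minimal sketch of what an error-oriented system does and EventLogging does not: collapsing identical errors into a single entry with a count. The function <code>groupErrors</code> and the error shape (<code>message</code>, <code>stack</code>) are hypothetical and are not taken from Sentry or EventLogging.

```javascript
// Hypothetical sketch: collapse identical errors into one entry with a count.
// The { message, stack } shape is an assumption made for this example.
function groupErrors(errors) {
  const groups = new Map();
  for (const { message, stack } of errors) {
    const key = `${message}\n${stack}`;
    const group = groups.get(key) || { message, stack, count: 0 };
    group.count += 1;
    groups.set(key, group);
  }
  return [...groups.values()];
}

const grouped = groupErrors([
  { message: 'x is undefined', stack: 'at render (app.js:10)' },
  { message: 'x is undefined', stack: 'at render (app.js:10)' },
  { message: 'timeout', stack: 'at fetchData (api.js:42)' },
]);

console.log(grouped);
```

Stored as EventLogging events, the same input would occupy three rows; grouped, it is two entries, one of them with a count of 2.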
[[Category:Data platform]]
[[Category:Data platform systems]]
inreshbirfdvydueipxabie69vcrrwp
Obsolete:Analytics/Archive/EventLogging/User agent sanitization
110
442884
2421391
2259968
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/User agent sanitization]] to [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]: pages are obsolete
2201894
wikitext
text/x-wiki
== Problem ==
[[:w:User_agent|User-agent strings]] often encode sufficiently many bits of information about the user's setup that they can be used to pinpoint an individual user, especially when
used in conjunction with other seemingly-anonymous datapoints. At the same time, user-agent strings are indispensable for performance measurements, front-end error reporting and browser support analysis.
Historically, EventLogging left user-agent logging and processing up to the developers and analysts. The problem with this approach is that the sanitization of user-agent strings is not uniform. When it is done at all, it is often done inconsistently. This often ends up being a barrier for releasing performance-related datasets that would greatly benefit from additional scrutiny.
The proposed solution is to centralize the processing of user-agents at the time of data collection.<ref>[[bugzilla:52295]]: Add sanitized User-Agent to default fields logged by EventLogging
</ref> This way we can be confident that all UAs logged via EventLogging are adequately sanitized.
== Use cases for User Agent collection ==
Why do we need user agent data?
'''To assess support needs.''' For example, we want to be sure we support every device with usage over x %, so we need to have device data reported with a precision of x.
'''Prioritizing Bug Fixes.''' In order to know whether a bug that affects a certain browser is a "must-fix" or a "nice to have" we need to have browser makeup stats, for both desktop and mobile.
'''To plan feature work.''' We need to know, for example, the percentage of tablets among the users using Wikipedia's mobile apps to plan application development. Or the number of users with "newer"
browsers to plan work for Visual Editor.
'''To contextualize performance numbers.''' For any kind of performance work we do we need to have browser data available to target our efforts smartly. An example of a performance schema in EL: https://meta.wikimedia.org/wiki/Schema:NavigationTiming
== Fingerprinting: background ==
'''Entropy''' is the mathematical quantity that measures density of information; in this case it can be thought of as how close a fact comes to revealing somebody's identity uniquely. Uniquely identifying a visiting user is referred to as '''"fingerprinting".''' When we learn a new fact about a person it reduces the entropy of their identity by a certain amount. Thus, the entropy of a user agent reflects the set of observable characteristics that can be used in concert with others to uniquely identify a user.
Entropy is normally measured in bits. For example, Peter Eckersley's Panopticlick study for the EFF<ref>https://www.eff.org/deeplinks/2010/01/tracking-by-user-agent</ref> finds that the User-Agent header provides about 10.0 bits of entropy. Since 2^10 == 1024 that means only 1 in 1024 random browsers visiting a site are expected to share the same user-agent header. Our goal when sanitizing user agents is to reduce the information the user agent provides about the user while still keeping enough information available to do performance diagnostics.
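The relationship between bits and "1 in N" can be checked with a few lines of arithmetic. This is a generic information-theory sketch; the function names are made up for the example.

```javascript
// Bits of identifying information revealed by a fact shared by a
// fraction `p` of all users (for example p = 1/1024).
function bitsFromFraction(p) {
  return -Math.log2(p);
}

// Size of the crowd a user hides in when a fact carries `bits` bits.
function crowdSizeFromBits(bits) {
  return Math.pow(2, bits);
}

console.log(bitsFromFraction(1 / 1024)); // about 10 bits
console.log(crowdSizeFromBits(10));      // 1024
```

Each bit removed by sanitization doubles the number of browsers that share the same sanitized user agent.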
== Sanitization ==
The steps taken to '''further reduce the entropy''' of a user agent are quite simple:
# We remove information pertaining to language first. These are tokens like en-ES that are present in some user agents.
# We remove minor versions. For example: AppleWebkit/525.3.1 gets transformed to AppleWebkit/525
# We remove information regarding toolbars, extensions, plugins, builds, Flash and Java when we find it. Disclaimer: these steps might include fully processing the user agent to identify the device, OS (major and minor) and browser (major). Pre-processing the UA has the advantage of filtering data so that we only store the data we care about. However, it has the disadvantage of coupling logging with a UA parsing solution. Also, since the data is pre-processed, the raw data is lost, so mistakes in the parsing library will not be easy to correct.
# To further sanitize the data, any solution needs to incorporate bucketing and place all UAs that haven't been encountered in the last N requests (for some appropriate value of N) into a generic "Other/unknown" bucket.
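The first two steps above can be sketched with two regular expressions. This is a simplified illustration, not the EventLogging implementation: <code>sanitizeUserAgent</code> is a hypothetical name, and the patterns assume that language tags look like <code>xx-XX</code> inside the parenthesized part and that versions follow a slash.

```javascript
// Hypothetical sketch of steps 1 and 2; not the EventLogging implementation.
function sanitizeUserAgent(ua) {
  return ua
    // Step 1: drop language tags such as "; en-ES" (assumed to be followed by ";" or ")").
    .replace(/;\s*[a-z]{2}-[A-Z]{2}(?=[;)])/g, '')
    // Step 2: keep only the major version, e.g. "AppleWebkit/525.3.1" becomes "AppleWebkit/525".
    .replace(/\/(\d+)(?:\.\d+)+/g, '/$1');
}

const raw = 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_6; en-ES) AppleWebkit/525.3.1 Safari/525.2';
const sanitized = sanitizeUserAgent(raw);

console.log(sanitized);
```

Steps 3 and 4 are not covered here; they need a real UA parser and a running count of recently seen user agents.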
=== Caveats ===
== Aggregation ==
While sanitization reduces the amount of private information a User Agent contains, it still leaves open the concern that you can pinpoint a browser session. The solution to this problem is reporting
data in an aggregated fashion, thus discarding the original dataset and just leaving aggregates, like: "3% of users have an iPhone 5".
=== Caveats ===
Aggregation requires that you know beforehand the reports you want to produce from the data. If the original records are discarded, it reduces your ability to explore the dataset.
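A minimal sketch of such an aggregate report is shown below. It turns a list of device labels into shares and folds rare values into an "Other/unknown" bucket. The function <code>aggregateShares</code> and the <code>minShare</code> threshold are assumptions for illustration; they are not part of EventLogging.

```javascript
// Hypothetical sketch: report the share of each label, folding labels
// below `minShare` into "Other/unknown". Only the shares are kept.
function aggregateShares(labels, minShare) {
  const counts = new Map();
  for (const label of labels) {
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  const shares = {};
  for (const [label, count] of counts) {
    const share = count / labels.length;
    const key = share >= minShare ? label : 'Other/unknown';
    shares[key] = (shares[key] || 0) + share;
  }
  return shares;
}

const devices = ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C'];
const report = aggregateShares(devices, 0.2);

console.log(report);
```

Once only <code>report</code> is kept, questions that were not anticipated (for example, a breakdown by OS version) can no longer be answered, which is the caveat described above.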
== UA in EventLogging vs HTTP headers ==
It should be noted that the user-agent header is sent by the client by default regardless of what EventLogging explicitly logs in the [[:m:Schema:EventCapsule|EventCapsule]]: if we send a sanitized version of the UA string in the event query string on the client-side, the raw header still gets sent in full with every request. Since we cannot prevent the client from sending raw UA headers, the current proposal is to make this data unavailable to any downstream subscriber of EventLogging data: we apply the canonicalization/sanitization as part of the parsing that precedes validation and broadcasting to subscribers. All the relevant EventLogging data consumers would be downstream relative to that. The current implementation doesn't perform any further transformation.
== References==
{{reflist}}
== Further reading ==
* https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy
* https://w3c.github.io/fingerprinting-guidance/
* http://panopticlick.eff.org/browser-uniqueness.pdf
* https://phabricator.wikimedia.org/diffusion/EEVL/browse/master/includes/EventLogging.php;f65dd5eedcc183c9a8a3319be56153209eba6221$61
[[Category:Data platform]]
[[Category:Data platform systems]]
mxxwaqdwx5wghq5fajzn90234cin2i7
Obsolete talk:Analytics/Archive/EventLogging/Administration
111
446065
2421393
2259972
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Administration]] to [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]: pages are obsolete
1867164
wikitext
text/x-wiki
{{ping|Nuria}}
If I try the code on the examples:
''sqoop list-tables --password-file '/user/nuria/mysql-analytics-research-client-pw.txt' --username research --connect jdbc:mysql://analytics-store.eqiad.wmnet/etwiki''
I get this error:
''20/05/22 20:58:26 ERROR manager.CatalogQueryManager: Failed to list tables''
''com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure''
I've already started the kinit, so that is not the problem. Any ideas of what I'm doing wrong? [[User:Diego]]
: Problem solved. The error was in the mysql server, replacing by: jdbc:mysql://dbstore1004.eqiad.wmnet:3313/etwiki works. (it doesn't work without specifying the port, at least for me. ) --[[User:Diego|Diego]] ([[User talk:Diego|talk]]) 02:38, 24 May 2020 (UTC)
tvkxfybjwl3o51kq4eozbj37kehvmzx
Help:Toolforge/Node.js
12
446456
2421340
2421328
2026-05-30T12:49:36Z
Clump
46696
Undid 2 revisions from [[Special:Diff/2421327|2421327]] until [[Special:Diff/2421328|2421328]] remove nonsense
2421340
wikitext
text/x-wiki
{{Template:Toolforge nav}}
[[W:Node.js|Node.js]] runs fairly well on Toolforge, including with WebSocket support, if you follow the steps below.
== Conventions ==
The Toolforge <code>toolforge webservice</code> command starts node.js web servers using convention rather than configuration. These conventions are expected by the Toolforge tooling:
* A ''$HOME/www/js/package.json'' file must exist as part of your tool's application code.
* Running <code>npm start</code> from the tool's ''$HOME/www/js'' directory must start the web server.
** This will happen automatically if your main script is found at ''$HOME/www/js/server.js'' [https://docs.npmjs.com/cli/start.html]
* The ''PORT'' environment variable will be set to the port that your web server is expected to listen on. When using the Kubernetes backend, PORT will always be ''8000''.
== Create a node.js web server ==
# Create a node.js web server. For example:
#:<syntaxhighlight lang="javascript">
var http = require('http');
var port = parseInt(process.env.PORT, 10) ; // IMPORTANT!! You HAVE to use this environment variable as port!
http.createServer(function (req, res) {
res.writeHead(200, {'Content-Type': 'text/plain'});
res.end('Hello World\n');
}).listen(port);
</syntaxhighlight>
# Save the web server as <code>$HOME/www/js/server.js</code>.
# Make sure your node.js web server starts up properly when <code>npm start</code> is executed. The default way to do this is to name your main script <code>server.js</code>.
# Your server should bind to a port that is passed in as an environment variable (<code>PORT</code>). You can access this via <code>process.env.PORT</code>. Without this, your tool will not be found by the Nginx proxy.
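If <code>PORT</code> is unset, <code>parseInt(process.env.PORT, 10)</code> evaluates to <code>NaN</code>, which can be confusing to debug. One option is to validate the value before starting the server. The helper below is a hypothetical sketch (<code>resolvePort</code> is not part of Toolforge or Node.js):

```javascript
// Hypothetical helper: read the port from an environment object and
// fail fast with a clear message if it is missing or not a valid port.
function resolvePort(env) {
  const port = parseInt(env.PORT, 10);
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error('PORT environment variable is missing or invalid: ' + env.PORT);
  }
  return port;
}

console.log(resolvePort({ PORT: '8000' }));
```

In the server above you would then call <code>.listen(resolvePort(process.env))</code> instead of passing the parsed value directly.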
The [https://github.com/siddharthvp/toolforge-node-app-base toolforge-node-app-base] template repository is available on GitHub which has the above-mentioned setups already done and some further boilerplate code, which can be used to get started quickly.
== Deploying a Vue JS Application using Node JS and Vite ==
# SSH into your instance and become your tool.
# Clone your project into the <code>$HOME/www/js</code> directory. If the directory doesn't exist, feel free to create it using the mkdir command. Ensure that the project is cloned such that the package.json is in the www/js directory using <code>git clone <nowiki>https://YOUR_PROJECT_URL</nowiki> .</code>
# Add a start script to your package.json file to install dependencies, build the static assets of your application, and run the server. For example: <syntaxhighlight lang="text">
"start": "npm install && npm run build && node server.js"
</syntaxhighlight>
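# You also need to add "express" as a dependency in your package.json. Because the server below uses <code>import</code> syntax, package.json may also need <code>"type": "module"</code>, depending on your Node.js version.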
# Create a node.js webserver in the <code>$HOME/www/js</code> directory using <code>nano server.js</code>.
# By default, when the build process completes, Vite creates a directory named <code>dist</code> containing your static files. These are the files the server will serve. Hence, a basic server configuration would be:<syntaxhighlight lang="javascript">
import express from "express";
const app = express();
const PORT = parseInt(process.env.PORT, 10); // IMPORTANT!! You HAVE to use this environment variable as port!
app.use(express.static("dist"));
app.listen(PORT, () => console.log(`Server listening on port: ${PORT}`));
</syntaxhighlight>
# To initialize the webserver use: <code>toolforge webservice {{toolforge latest image|node}} start</code>
# To check the logs/build process, use: <code>toolforge webservice {{toolforge latest image|node}} logs</code>
# Once the build process is complete and the server begins listening, you can view your Vue application by visiting: <code><nowiki>https://YOUR_TOOL_NAME.toolforge.org</nowiki></code>
== Kubernetes Configuration ==
The webservice command accepts the following parameters:
:<code>toolforge webservice {{toolforge latest image|node}} start|stop|restart|shell|logs</code>
# Put your node application in <code>$HOME/www/js</code> in your tool's home directory.
# Start the web service with the following command: <code>toolforge webservice {{toolforge latest image|node}} start</code>.
#* If the start fails you may need to create <code>$HOME/www/js/package.json</code> containing the text:
#:<syntaxhighlight lang="json">
{
"scripts": {
"start": "node server.js"
}
}
</syntaxhighlight>
#* To restart after a code change, run <code>toolforge webservice {{toolforge latest image|node}} restart</code>.
# Find your pod's name by running <code>kubectl get pods</code>.
# Use the pod name to check your pod's logs <code>kubectl logs -f $MY_POD_NAME</code>.
# PROFIT! :)
=== Running npm with toolforge webservice shell ===
To use an up-to-date version of node, e.g. for installing dependencies, run:
# <code>toolforge webservice {{toolforge latest image|node}} shell</code>
# <code>cd $HOME/www/js</code>
# <code>npm install</code>
== Using other versions of node ==
If you need to use other versions of node, you can use [https://github.com/creationix/nvm nvm] or a similar tool to install node versions locally.
To activate the version, define the <code>start</code> property of the <code>scripts</code> object in your <code>package.json</code> file to activate the needed version before starting your app. In its simplest form it could look like <code>"scripts": {"start":"nvm run node server.js"}</code>.
== Troubleshooting ==
*If you run into errors doing <code>npm install</code>, try <code>LINK=g++ npm install</code>
*If you can't access the <code>kubectl</code> executable, could it be that you [[Help:Toolforge/Web#Running_npm_with_webservice_shell|started a webservice shell]] and didn't <code>exit</code> it?
{{:Help:Cloud Services communication}}
==See also==
*[[Help:Toolforge/Deno]]
*[[Help:Toolforge/Web]]
* [[Help:Toolforge/My first NodeJS OAuth tool]]
[[Category:How-to-guide|Node]]
[[Category:Node.js]]
88ulhlt7tnoprk5vuh9tnp419252fxl
Map of database maintenance
0
449160
2421408
2421326
2026-05-31T00:01:22Z
Dexbot
30554
Bot: Updating the report
2421408
wikitext
text/x-wiki
{{/Header}}
== Today (2026-05-31) ==
== Yesterday (2026-05-30) ==
== Last seven days ==
{| class="wikitable"
|+ eqiad
|-
! Section !! Work
|-
| pc3 || [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
|-
| pc4 || [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
|-
| s1 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s2 ||
* [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
* [[phab:T425622|Switchover s2 master (db1222 -> db1162) (T425622)]] (fceratto)
|-
| s6 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s8 ||
* [[phab:T419635|Drop il_to column from imagelinks table in wmf production (T419635)]] (fceratto)
* [[phab:T426095|Switchover s8 master (db1209 -> db1193) (T426095)]] (fceratto)
* [[phab:T426633|Login (T426633)]] (fceratto)
|-
| x1 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| x3 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
|}
{| class="wikitable"
|+ codfw
|-
! Section !! Work
|-
| pc3 || [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
|-
| pc4 || [[phab:T418973|Productionize pc20[21-24] and pc10[21-24] (T418973)]] (marostegui)
|-
| s1 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s2 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s3 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s4 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s5 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s6 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s7 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| s8 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
| x1 || [[phab:T426633|Login (T426633)]] (fceratto)
|-
|}
[[Category:MariaDB]]
74v3d6nf5qptkuv3xek7estfttof4ip
Obsolete:Analytics/Archive/EventLogging/Data retention
110
451958
2421355
2259932
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention]]: pages are obsolete
2198591
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention]]
ref2skyp95soto738tt04ioyi76knr2
Obsolete:Analytics/Archive/EventLogging/Data retention/AppInstallId
110
451959
2421357
2259934
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention/AppInstallId]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention/AppInstallId]]: pages are obsolete
2198592
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention/AppInstallId]]
icsxot2rofrwv3qotx36gy80y9wq2v6
Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging
110
451960
2421359
2259936
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention and auto-purging]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging]]: pages are obsolete
2198593
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention]]
ref2skyp95soto738tt04ioyi76knr2
Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId
110
451961
2421361
2259938
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId]]: pages are obsolete
2198594
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention/AppInstallId]]
icsxot2rofrwv3qotx36gy80y9wq2v6
Obsolete:Analytics/Archive/EventLogging/How to
110
451963
2421365
2266486
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/How to]] to [[Obsolete:Analytics/Archive/EventLogging/How to]]: pages are obsolete
2266486
wikitext
text/x-wiki
#REDIRECT [[Analytics/Archive/EventLogging/Administration#How Tos]]
iariffsys9ycmob509245twylpmk52d
2421518
2421365
2026-05-31T09:23:24Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421518
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration#How Tos]]
jtj92pzy0a76sqa6yjslz4k2bvfmnhj
Obsolete:Analytics/Archive/EventLogging/New pipeline
110
451965
2421369
2259946
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/New pipeline]] to [[Obsolete:Analytics/Archive/EventLogging/New pipeline]]: pages are obsolete
2046828
wikitext
text/x-wiki
#REDIRECT [[Analytics/Archive/EventLogging pipeline]]
shrtt6t6y4gx8tbdte7pveq3ou9k5y1
Obsolete:Analytics/Archive/EventLogging/Oncall
110
451967
2421373
2266487
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Oncall]] to [[Obsolete:Analytics/Archive/EventLogging/Oncall]]: pages are obsolete
2266487
wikitext
text/x-wiki
#REDIRECT [[Analytics/Archive/EventLogging/Administration]]
gs0pgwtt47zvavqoc6i8f0g44a0tgik
2421519
2421373
2026-05-31T09:23:25Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421519
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Obsolete:Analytics/Archive/EventLogging/TestingOnBetaLabs
110
451975
2421389
2266488
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/TestingOnBetaLabs]] to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaLabs]]: pages are obsolete
2266488
wikitext
text/x-wiki
#REDIRECT [[Analytics/Archive/EventLogging/TestingOnBetaCluster]]
lg4jp8slwi8akw402tb2r66lw6qaq4w
2421520
2421389
2026-05-31T09:23:26Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421520
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Obsolete talk:Analytics/Archive/EventLogging/Data retention
111
451979
2421395
2259974
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Data retention]] to [[Obsolete talk:Analytics/Archive/EventLogging/Data retention]]: pages are obsolete
2198886
wikitext
text/x-wiki
#REDIRECT [[Talk:Data Platform/Systems/Event Data retention]]
ftxupil1ga4lxpt6tkgkh9sbfmmj5p6
Obsolete talk:Analytics/Archive/EventLogging/Data retention and auto-purging
111
451980
2421397
2259976
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Data retention and auto-purging]] to [[Obsolete talk:Analytics/Archive/EventLogging/Data retention and auto-purging]]: pages are obsolete
2198887
wikitext
text/x-wiki
#REDIRECT [[Talk:Data Platform/Systems/Event Data retention]]
ftxupil1ga4lxpt6tkgkh9sbfmmj5p6
Nova Resource:Tools.cluebotng-review/SAL
498
452890
2421407
2421087
2026-05-30T18:17:29Z
Stashbot
7414
wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26691257350 (https://github.com/cluebotng/component-configs/commits/1daccd14ff5fe952e32175ee2cf249f2312d99ae)
2421407
wikitext
text/x-wiki
=== 2026-05-30 ===
* 18:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26691257350 (https://github.com/cluebotng/component-configs/commits/1daccd14ff5fe952e32175ee2cf249f2312d99ae)
=== 2026-05-29 ===
* 00:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26609639469 (https://github.com/cluebotng/component-configs/commits/8c2fccaaae357774084389157d9a305e72eccb20)
=== 2026-05-28 ===
* 18:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26592792853 (https://github.com/cluebotng/component-configs/commits/a7971b7e286e177862e5318c40b0d4d868efc7c8)
=== 2026-05-21 ===
* 20:40 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26251627549 (https://github.com/cluebotng/component-configs/commits/96f9184e66a6e4b35a49f02940a213125945b056)
=== 2026-05-19 ===
* 00:18 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/26068076396 (https://github.com/cluebotng/component-configs/commits/f7db7f6fff0d4d6dd451b5f92e75ba755a74129c)
=== 2026-05-17 ===
* 08:58 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25986406220 (https://github.com/cluebotng/component-configs/commits/be3bb145d2803394cd0b7dbd8ae1775ac9b7cd09)
=== 2026-05-14 ===
* 18:49 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25878752531 (https://github.com/cluebotng/component-configs/commits/21e928fa1870ddaf5fae15afc6f92aa3cb3fb970)
=== 2026-05-13 ===
* 02:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25773616037 (https://github.com/cluebotng/component-configs/commits/0fd601991775a24b437113d09438e74b996c991b)
=== 2026-05-12 ===
* 10:14 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25727821525 (https://github.com/cluebotng/component-configs/commits/91aefb7d53013ad152bb721f71980dd26170f297)
* 09:16 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25724842068 (https://github.com/cluebotng/component-configs/commits/8bc931f8c1f1c93df322457a7abadec867f9f46c)
* 09:08 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25724562210 (https://github.com/cluebotng/component-configs/commits/bd0e188642746ab949ec3762676ac730afff1c17)
* 08:43 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/25723480598 (https://github.com/cluebotng/component-configs/commits/25c0a1035daa67c2225c0f7f7a414ff5cfb6ed2a)
* 08:42 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25723280529 (https://github.com/cluebotng/component-configs/commits/51d7c1919958a7672895885cbb3a1061934d2788)
=== 2026-05-06 ===
* 18:38 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25453754970 (https://github.com/cluebotng/component-configs/commits/92f164d1ab158aea1f76cd0a787f33ffe4017e85)
=== 2026-05-02 ===
* 12:59 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/25252392660 (https://github.com/cluebotng/component-configs/commits/7352cd4f730ca9f5c276772f0b338230989feef4)
=== 2026-04-24 ===
* 22:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24914278528 (https://github.com/cluebotng/component-configs/commits/23a4b53f3d291b0c750d44a2c0a661333307786d)
=== 2026-04-20 ===
* 23:22 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24695320410 (https://github.com/cluebotng/component-configs/commits/279edf060f43353ea66e6d057773bfdb883b16a1)
=== 2026-04-17 ===
* 00:14 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24540678937 (https://github.com/cluebotng/component-configs/commits/26849735bbefbe218cbe0ce41db5a35941798c7b)
=== 2026-04-14 ===
* 21:08 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24422836848 (https://github.com/cluebotng/component-configs/commits/10f4f0f81e169fac55d056176a273966c8160078)
=== 2026-04-11 ===
* 10:29 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24280508671 (https://github.com/cluebotng/component-configs/commits/5953d3fb9c5e414df6995740382b7bd3be49ced2)
* 10:21 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24280372486 (https://github.com/cluebotng/component-configs/commits/5b34645dff3f37bc9f974635e03cd6b8436f37d1)
=== 2026-04-10 ===
* 23:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24268430819 (https://github.com/cluebotng/component-configs/commits/3652893dce02243971055a6ab740363f103ce104)
* 23:04 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24268031378 (https://github.com/cluebotng/component-configs/commits/31367659ada078f50022f1df4b16b6139db27c09)
* 22:50 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24267603054 (https://github.com/cluebotng/component-configs/commits/ba252b54cec9387b47dd4ac4a347d4a9c5118c3e)
* 16:08 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24251572443 (https://github.com/cluebotng/component-configs/commits/2426c8db99c6d44c954ced07c9f41fcaa9e8e549)
* 15:53 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24251572443 (https://github.com/cluebotng/component-configs/commits/2426c8db99c6d44c954ced07c9f41fcaa9e8e549)
* 15:26 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24250442439 (https://github.com/cluebotng/component-configs/commits/bfa8b761a017e9b8bb69ae52c5cb731d17bd324f)
* 15:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24249898021 (https://github.com/cluebotng/component-configs/commits/68514222ba9a90ece524baf75b02c9835faf87d3)
* 14:27 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24247620609 (https://github.com/cluebotng/component-configs/commits/e63a941f5b83d97a9751af731c869062ceef4519)
* 14:26 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24247581365 (https://github.com/cluebotng/component-configs/commits/49becfde53d5f960c8e4df0484cebb2bb4d4c5aa)
* 13:59 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24246395413 (https://github.com/cluebotng/component-configs/commits/945fa198e64a0e63b777bb570d57d68ef0ce3f69)
* 13:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24245816710 (https://github.com/cluebotng/component-configs/commits/6181fdda40150d3535541f3084ac7ff245f19536)
* 13:36 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24245353001 (https://github.com/cluebotng/component-configs/commits/97eebf1bcdf5be901e0d3fd82c1b3ea6a8668163)
* 13:31 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24245099622 (https://github.com/cluebotng/component-configs/commits/251c10040c01caf2ba9b855050c318d5d2fd8e81)
* 13:27 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24244959273 (https://github.com/cluebotng/component-configs/commits/2a6605ee2d07c0ff0d690aaa8aabed0ca35bab72)
* 04:34 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24226442413 (https://github.com/cluebotng/component-configs/commits/4f895f83dae3f356cae2a1bbcfea51dd9d18bd15)
* 01:21 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24221486227 (https://github.com/cluebotng/component-configs/commits/d96804861818d7786153d18d47be075a4dbbb6f2)
=== 2026-04-09 ===
* 20:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24212235180 (https://github.com/cluebotng/component-configs/commits/1cd21afab7312bd0122c0e735f8f4dca03019011)
* 19:13 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24208407251 (https://github.com/cluebotng/component-configs/commits/6cd680dd209bf7fbb01cf24cb6cca82f0fab716d)
* 18:32 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24206712296 (https://github.com/cluebotng/component-configs/commits/e7d5ec988541b9d441a5c565f624b7e88e11204f)
* 18:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/24206093220 (https://github.com/cluebotng/component-configs/commits/a97bfe791582e24f1c696f1bd89b965ea233c253)
* 14:20 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24195131449 (https://github.com/cluebotng/component-configs/commits/6b512f6db7cc4e49078b135e437185906821ae81)
=== 2026-04-08 ===
* 05:03 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24118645397 (https://github.com/cluebotng/component-configs/commits/908dd70b5972cca0c0dafbe50a0020547b833a4e)
=== 2026-04-07 ===
* 22:08 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/24106708812 (https://github.com/cluebotng/component-configs/commits/b85f56b6997ccf41cc8ea32f33a61809b68b9bc5)
=== 2026-04-02 ===
* 22:11 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/23924337776 (https://github.com/cluebotng/component-configs/commits/266152f9f673810b0c9460b5828cb86e7aee31d9)
=== 2026-03-31 ===
* 06:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23782957628 (https://github.com/cluebotng/component-configs/commits/7888bbd75773dc064d78ad2ee8949f1540eab0fd)
* 01:21 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23775778726 (https://github.com/cluebotng/component-configs/commits/47f8e20c39e29b952f6dbbd04917970802ce1a0b)
=== 2026-03-27 ===
* 17:45 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/23659656811 (https://github.com/cluebotng/component-configs/commits/f4a494492433360a06326a918985c51c6d0828d4)
* 17:43 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23659537327 (https://github.com/cluebotng/component-configs/commits/c3f980e28e95bd1081b2ed9c903d2ac4d51b2c3b)
=== 2026-03-23 ===
* 10:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23433070550 (https://github.com/cluebotng/component-configs/commits/dd92649311ee430b4225d5c6db5d6e6b16d10a86)
* 10:40 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23433054689 (https://github.com/cluebotng/component-configs/commits/c895b3d11b8546e54e4cca5ba350c0a5ca9c5917)
=== 2026-03-21 ===
* 16:26 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/23383565693 (https://github.com/cluebotng/component-configs/commits/48390b500ab2b65905e09987c12a3e42c3f69778)
* 16:23 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23383640335 (https://github.com/cluebotng/component-configs/commits/3497a25c3d209bdf8f64f3ec3e77e52f2f8debfa)
* 16:19 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/23383565693 (https://github.com/cluebotng/component-configs/commits/48390b500ab2b65905e09987c12a3e42c3f69778)
* 16:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23383551619 (https://github.com/cluebotng/component-configs/commits/ffff74b90a37a0c6bdd565128d3c11ae195e0763)
=== 2026-03-20 ===
* 04:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23329334324 (https://github.com/cluebotng/component-configs/commits/bd7700c30291bfab3a656aa8f257292e287a71ca)
=== 2026-03-19 ===
* 09:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23287809902 (https://github.com/cluebotng/component-configs/commits/0976850451c9fbb8c4afb773cc70b91cd7c6fdeb)
=== 2026-03-17 ===
* 21:34 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23217273642 (https://github.com/cluebotng/component-configs/commits/4fe6cb3d8ad39b60746b3b6bd2f83c4d05a82d6b)
=== 2026-03-13 ===
* 01:01 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/23031229526 (https://github.com/cluebotng/component-configs/commits/ebd67e60183f161276bf0e13daab55ceb2463eb2)
=== 2026-03-10 ===
* 00:45 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22881598726 (https://github.com/cluebotng/component-configs/commits/bc32d8044077ff83db8b985b87df029ff564ad29)
=== 2026-03-07 ===
* 00:53 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22788149115 (https://github.com/cluebotng/component-configs/commits/b3731fab9a7f4f225ecbe318fa80808de6c904b0)
=== 2026-03-06 ===
* 09:08 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22756671151 (https://github.com/cluebotng/component-configs/commits/397fc33968a3c4795b97b1791a0b991ebeb81430)
=== 2026-03-04 ===
* 09:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22662750872 (https://github.com/cluebotng/component-configs/commits/01746ef8804c30c85963ea888a75887ebe879e3b)
* 01:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22650516003 (https://github.com/cluebotng/component-configs/commits/e7a1e2e06f2ccf038c06cb203369f336c298cf6c)
=== 2026-03-03 ===
* 21:09 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22642685900 (https://github.com/cluebotng/component-configs/commits/3cbfb68b3c0e7d97130ede1be762389f300234d2)
=== 2026-03-02 ===
* 01:04 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22557173037 (https://github.com/cluebotng/component-configs/commits/a414cf552e0a0c0d2c9e9817f922d56a4c899bf6)
=== 2026-02-27 ===
* 01:18 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22468544596 (https://github.com/cluebotng/component-configs/commits/b961f37db0544196a7206882b8e3f2292b7e0894)
=== 2026-02-25 ===
* 21:10 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22415923749 (https://github.com/cluebotng/component-configs/commits/a2a4f5ecffad1b49c96c33b5045430a5b75f71bc)
* 11:57 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22395760068 (https://github.com/cluebotng/component-configs/commits/c6093c4ed72aba8fa453b2f67e48d1effeaabb4b)
=== 2026-02-24 ===
* 21:29 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22370719324 (https://github.com/cluebotng/component-configs/commits/7170b95a5f9b6be3c928684f1e9c436deb3ddd1f)
* 00:46 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22331436693 (https://github.com/cluebotng/component-configs/commits/372e84511fdcb0893755ac22f399d3f24f438f7b)
=== 2026-02-21 ===
* 05:46 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22251332525 (https://github.com/cluebotng/component-configs/commits/29401fbb166f71e375eca7254fe841cb01836d2f)
=== 2026-02-20 ===
* 13:46 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22226385604 (https://github.com/cluebotng/component-configs/commits/fbd7d861a7062a2c09fd2117cbf569beb53916f4)
* 08:34 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22216991615 (https://github.com/cluebotng/component-configs/commits/64d521535aa35454c28900f70009efc0e9ff4a10)
* 05:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22212024170 (https://github.com/cluebotng/component-configs/commits/9b0508c1c5a875dd795c865e67f2a93d4f247597)
=== 2026-02-19 ===
* 13:12 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22183076688 (https://github.com/cluebotng/component-configs/commits/919ebb8860a93b9d071da361cad56448a3b1f2b4)
* 00:53 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22163899784 (https://github.com/cluebotng/component-configs/commits/0ce51e9bda73cc3ee0df647f7ba8dcfd02eb97e6)
=== 2026-02-16 ===
* 02:14 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22047780823 (https://github.com/cluebotng/component-configs/commits/3a1f6b151d38aab4ce1a62509b108ac9afc5230b)
=== 2026-02-15 ===
* 08:39 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22032600290 (https://github.com/cluebotng/component-configs/commits/80b9cda20d3f21e2f901db6ccbd168bfffb6b063)
=== 2026-02-14 ===
* 20:42 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22023900270 (https://github.com/cluebotng/component-configs/commits/e8878c3f7a08aa1712126c1b6490f6db41621f44)
* 20:28 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22023724371 (https://github.com/cluebotng/component-configs/commits/de3779f7adea66769077e2380d7b0ce25f3d9e82)
* 20:27 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22023714306 (https://github.com/cluebotng/component-configs/commits/7cd082664340738b5c6cc46d0a195f3814672a3a)
* 19:42 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22022830785 (https://github.com/cluebotng/component-configs/commits/9b3a727405218dd32b8f5b5d34d8906fe1ba840c)
* 19:36 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22022830785 (https://github.com/cluebotng/component-configs/commits/9b3a727405218dd32b8f5b5d34d8906fe1ba840c)
* 19:17 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22022830785 (https://github.com/cluebotng/component-configs/commits/9b3a727405218dd32b8f5b5d34d8906fe1ba840c)
* 19:16 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/22022827162 (https://github.com/cluebotng/component-configs/commits/0e1693c2b662aaa0c9264ceef355bcbfbc162ea7)
* 19:01 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22022587196 (https://github.com/cluebotng/component-configs/commits/57dcc675b3ed54fc17f697d6c7b9554b5d06aab0)
* 18:52 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/22022444685 (https://github.com/cluebotng/component-configs/commits/1e0b17c59284d25ea8ac39a455abb9921ee6608a)
=== 2025-11-26 ===
* 19:58 wm-bot2: Deployment failed: https://github.com/cluebotng/component-configs/actions/runs/19715882591 (https://github.com/cluebotng/component-configs/commits/18c2bc79b5f0023e682a9245197cf87c5cc76943)
=== 2025-11-11 ===
* 15:39 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19270642915 (https://github.com/cluebotng/component-configs/commits/3fe913812986e82db75d4a6657cba3f697f5649c)
* 15:27 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19270263370 (https://github.com/cluebotng/component-configs/commits/d1674e8f4f6cec3b48e848137ce42585278d4a67)
=== 2025-11-09 ===
* 22:22 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19215285730 (https://github.com/cluebotng/component-configs/commits/bf77359fc102b05a026ea8b66dc01ff16a936804)
* 22:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19215085734 (https://github.com/cluebotng/component-configs/commits/6d8f2491239fbe29d19544922253d9930a88e7a0)
* 20:30 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19213996489 (https://github.com/cluebotng/component-configs/commits/38bc77281c9dbd1100915d95ba68705d8a7392a7)
* 20:25 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19213940731 (https://github.com/cluebotng/component-configs/commits/c01b89b7b0455d4f1cc63a2eb002f9c55c0a663f)
=== 2025-11-05 ===
* 19:57 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19114484110 (https://github.com/cluebotng/component-configs/commits/fae01bfaeaeca0cf7676ece10cbd39948560086f)
* 16:29 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/19108874377 (https://github.com/cluebotng/component-configs/commits/3f51ec3aa53d1378883a9dc973716e57c283d26c)
=== 2025-10-29 ===
* 15:19 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18912872633 (https://github.com/cluebotng/component-configs/commits/3281794d8d1d2e17d9e9859c6f6f7ae3c5216eda)
=== 2025-10-23 ===
* 12:32 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18748267282 (https://github.com/cluebotng/component-configs/commits/bc8f1b883d0d53edf08bea5e5319ee7ee0b4fb82)
=== 2025-10-07 ===
* 06:48 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18304366702 (https://github.com/cluebotng/component-configs/commits/5b83bca0e9293029698d7f3a1b2764727ae7f971)
=== 2025-10-06 ===
* 06:49 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18272390911 (https://github.com/cluebotng/component-configs/commits/49abbdd5dd7066314199c213043305ceed2b54f7)
=== 2025-10-05 ===
* 06:43 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18255184611 (https://github.com/cluebotng/component-configs/commits/7fe1a04069d9d0b4b11019443c85885c202852d4)
=== 2025-10-03 ===
* 06:47 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18215013747 (https://github.com/cluebotng/component-configs/commits/7ab2bbe022e2513dc81a13a7055c4c7736e5f876)
=== 2025-09-29 ===
* 16:41 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18104101417 (https://github.com/cluebotng/component-configs/commits/c49408a6e0285932adef0b5cc39e15d06c8742f5)
* 15:50 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18102721922 (https://github.com/cluebotng/component-configs/commits/87ddcf2fce928fde2ba91ecdba3561b12b8de1d2)
* 14:16 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18099932067 (https://github.com/cluebotng/component-configs/commits/0de901e1203dd61656503ef2127efe360e9ed6cc)
* 09:18 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18091994310 (https://github.com/cluebotng/component-configs/commits/ff3951fa5af87196929a9a864f8189b7a7436ac8)
* 09:14 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18091898160 (https://github.com/cluebotng/component-configs/commits/ff3951fa5af87196929a9a864f8189b7a7436ac8)
=== 2025-09-27 ===
* 13:08 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18060157048 (https://github.com/cluebotng/component-configs/commits/3aa079ed0cb7aa29f9ece46a47ad96203e53f242)
=== 2025-09-26 ===
* 22:18 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18050560602 (https://github.com/cluebotng/component-configs/commits/886ded0824a9ce7b27c852949f3530bda15bef14)
* 11:55 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18036958606 (https://github.com/cluebotng/component-configs/commits/a51fe109bfad3e2df5aa8e89b837a951bf8ad2cf)
* 06:47 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18030114498 (https://github.com/cluebotng/component-configs/commits/ea47ef95beb4cf8a1b7d439a83af7b2d4cf168ce)
=== 2025-09-25 ===
* 17:47 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18016018062 (https://github.com/cluebotng/component-configs/commits/150020d96f0c95173ba88c382221223a0c1f7a8d)
* 17:44 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18015998409 (https://github.com/cluebotng/component-configs/commits/5592cdfcdc7e683a993c8e784d83fb1a71a0b04c)
* 16:56 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18014801858 (https://github.com/cluebotng/component-configs/commits/4f92189a79e68827f38e9a6a233b20c02529e77c)
* 16:55 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18014528718 (https://github.com/cluebotng/component-configs/commits/96654b441f84901e1a607ced407eb9babb8fdbfc)
* 16:45 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18014528718 (https://github.com/cluebotng/component-configs/commits/96654b441f84901e1a607ced407eb9babb8fdbfc)
* 16:33 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18014221959 (https://github.com/cluebotng/component-configs/commits/b0737b89fc85c164c5a869aff21421ba21af2e4d)
* 16:16 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18013782292 (https://github.com/cluebotng/component-configs/commits/7e1eb9e3c9a52e0dd71cc58dc797183236a1c27e)
* 16:12 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18013677351 (https://github.com/cluebotng/component-configs/commits/371029d320611d8be6103da43ce9e0a91a2f8e1a)
* 16:07 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18013531088 (https://github.com/cluebotng/component-configs/commits/9a6dc9f53f08ea206e75ad75ddddc3429e1e004f)
* 15:34 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18012641376 (https://github.com/cluebotng/component-configs/commits/87c176492b1f1fb18570dbb70687258843c5773c)
* 14:17 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/18008240418 (https://github.com/cluebotng/component-configs/commits/9949a4a5acff374c1edd7b6e21959a28721e02d0)
=== 2025-09-24 ===
* 17:58 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17985198575 (https://github.com/cluebotng/component-configs/commits/cfa2541734b05a9da326bbeab2e82cc21d6e91e4)
* 17:40 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17984820837 (https://github.com/cluebotng/component-configs/commits/6f47ae931d95d85e2c3c1d6b42f1eabc6d3b1960)
* 17:06 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17984009157 (https://github.com/cluebotng/component-configs/commits/refs/heads/main)
* 16:55 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17983740752 (https://github.com/cluebotng/component-configs/commits/refs/heads/main)
* 12:52 wm-bot2: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17977211158 (https://github.com/cluebotng/component-configs/commits/refs/heads/main)
* 12:11 wmbot~component-configs@tools-bastion: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17976182663
* 12:06 wmbot~component-configs@tools-bastion: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17976062312
* 06:46 wmbot~component-configs@tools-bastion: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17968643293
=== 2025-09-22 ===
* 20:43 wmbot~component-configs@tools-bastion: Test migrating log to feed channel
* 20:43 wmbot~damian-scripts@tools-bastion-15: Test migrating log to feed channel
* 19:12 wmbot~damian-scripts@tools-bastion-15: Deployment completed: https://github.com/cluebotng/component-configs/actions/runs/17925793114
=== 2025-08-24 ===
* 18:02 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.3.0
=== 2025-08-23 ===
* 19:44 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.2.5
* 18:37 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.2.4
* 18:10 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.2.3
=== 2025-08-14 ===
* 13:39 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.2.2
* 13:27 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.2.1
* 13:20 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.2.0
=== 2025-08-13 ===
* 20:07 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.1.10
* 18:53 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.1.9
=== 2025-08-11 ===
* 19:49 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.1.7
=== 2025-08-10 ===
* 18:19 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.1.6
* 15:06 wmbot~damian-scripts@tools-bastion-13: reviewer deployed @ refs/tags/v0.1.5
=== 2025-08-07 ===
* 15:38 wmbot~damian@tools-bastion-13: reviewer deployed @ refs/tags/v0.1.2
=== 2023-06-15 ===
* 18:43 wm-bot: <root> webservice restart, checked pods
<noinclude>[[Category:SAL]]</noinclude>
lj8etyxhk8nsxf89z3fnh6dwebnw1ck
Data Engineering/Systems/EventLogging
0
454326
2421454
2266527
2026-05-31T09:22:07Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421454
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Data Engineering/Systems/EventLogging/Administration
0
454327
2421455
2266528
2026-05-31T09:22:08Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421455
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Data Engineering/Systems/EventLogging/Architecture
0
454328
2421456
2266529
2026-05-31T09:22:09Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
2421456
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
563zn02ep8hvymxagfoiuv51v07t6wn
Data Engineering/Systems/EventLogging/Backfilling
0
454329
2421457
2266530
2026-05-31T09:22:11Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
2421457
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
ixtg113mywcmt0chk7x1xzr8d83eidp
Data Engineering/Systems/EventLogging/Data representations
0
454330
2421458
2266531
2026-05-31T09:22:12Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
2421458
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
5e27mterkyfcdw6nxfeaiwk2jadfz6n
Data Engineering/Systems/EventLogging/EventCapsule
0
454331
2421459
2266532
2026-05-31T09:22:13Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
2421459
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
ekk4tm809d58l8akk4jfh8oiu6c67qj
Data Engineering/Systems/EventLogging/Monitoring
0
454332
2421461
2266535
2026-05-31T09:22:15Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
2421461
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
car9duvgoadqdlvtghdnzp1qalu0ntt
Data Engineering/Systems/EventLogging/NotErrorLogging
0
454333
2421462
2266536
2026-05-31T09:22:17Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
2421462
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
ei45szt89ywomi0djg2wti5uaglaiec
Data Engineering/Systems/EventLogging/Outages
0
454334
2421464
2266538
2026-05-31T09:22:19Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Outages]]
2421464
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Data Engineering/Systems/EventLogging/Performance
0
454335
2421465
2266539
2026-05-31T09:22:20Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Performance]]
2421465
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Performance]]
qjrz2uod4uyospuds419rd1uxhkdq3p
Data Engineering/Systems/EventLogging/Publishing
0
454336
2421466
2266540
2026-05-31T09:22:21Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
2421466
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
nmbj4rbsh37le6a3rekioqs0hzpkwp1
Data Engineering/Systems/EventLogging/Sanitization vs Aggregation
0
454337
2421467
2266541
2026-05-31T09:22:23Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421467
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
ozb5iyz2xjb7w6y5xu3x21aukwi414q
Data Engineering/Systems/EventLogging/Schema Guidelines
0
454338
2421468
2266542
2026-05-31T09:22:24Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
2421468
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
ds2ev3dw8t6jov0i0t75yawt9o5qhae
Data Engineering/Systems/EventLogging/Sensitive Fields
0
454339
2421469
2266543
2026-05-31T09:22:25Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
2421469
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
k9nkblptw4pz7l3cvunzjff64jjqh0w
Data Engineering/Systems/EventLogging/TestingOnBetaCluster
0
454340
2421470
2266544
2026-05-31T09:22:26Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421470
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Data Engineering/Systems/EventLogging/User agent sanitization
0
454341
2421472
2266546
2026-05-31T09:22:29Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
2421472
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
lq89feu0hym2433yawxpz27l176t1h0
Talk:Data Engineering/Systems/EventLogging
1
454342
2421509
2266571
2026-05-31T09:23:13Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421509
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Talk:Data Engineering/Systems/EventLogging/Administration
1
454343
2421510
2266572
2026-05-31T09:23:15Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
2421510
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
d8aoaqa0dou1rrttrh352hsovjjill0
Talk:Data Engineering/Systems/EventLogging/Sanitization vs Aggregation
1
454344
2421511
2266573
2026-05-31T09:23:16Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421511
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
lzpyfc1dpi1gsu0axf5047tshmgp3fh
Analytics/Systems/EventLogging
0
455011
2421435
2266507
2026-05-31T09:21:43Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421435
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Analytics/Systems/EventLogging/Administration
0
455012
2421436
2266508
2026-05-31T09:21:45Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421436
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/Systems/EventLogging/Architecture
0
455013
2421437
2266509
2026-05-31T09:21:46Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
2421437
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
563zn02ep8hvymxagfoiuv51v07t6wn
Analytics/Systems/EventLogging/Backfilling
0
455014
2421438
2266510
2026-05-31T09:21:47Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
2421438
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
ixtg113mywcmt0chk7x1xzr8d83eidp
Analytics/Systems/EventLogging/Data representations
0
455015
2421439
2266511
2026-05-31T09:21:48Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
2421439
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
5e27mterkyfcdw6nxfeaiwk2jadfz6n
Analytics/Systems/EventLogging/EventCapsule
0
455020
2421440
2266512
2026-05-31T09:21:49Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
2421440
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
ekk4tm809d58l8akk4jfh8oiu6c67qj
Analytics/Systems/EventLogging/How to
0
455021
2421441
2266513
2026-05-31T09:21:51Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421441
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration#How Tos]]
jtj92pzy0a76sqa6yjslz4k2bvfmnhj
Analytics/Systems/EventLogging/Monitoring
0
455022
2421442
2266514
2026-05-31T09:21:52Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
2421442
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
car9duvgoadqdlvtghdnzp1qalu0ntt
Analytics/Systems/EventLogging/NotErrorLogging
0
455024
2421443
2266515
2026-05-31T09:21:53Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
2421443
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
ei45szt89ywomi0djg2wti5uaglaiec
Analytics/Systems/EventLogging/Oncall
0
455025
2421444
2266516
2026-05-31T09:21:55Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421444
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/Systems/EventLogging/Outages
0
455026
2421445
2266517
2026-05-31T09:21:56Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Outages]]
2421445
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Analytics/Systems/EventLogging/Performance
0
455027
2421446
2266518
2026-05-31T09:21:57Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Performance]]
2421446
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Performance]]
qjrz2uod4uyospuds419rd1uxhkdq3p
Analytics/Systems/EventLogging/Publishing
0
455028
2421447
2266519
2026-05-31T09:21:58Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
2421447
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
nmbj4rbsh37le6a3rekioqs0hzpkwp1
Analytics/Systems/EventLogging/Sanitization vs Aggregation
0
455029
2421448
2266520
2026-05-31T09:22:00Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421448
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
ozb5iyz2xjb7w6y5xu3x21aukwi414q
Analytics/Systems/EventLogging/Schema Guidelines
0
455030
2421449
2266521
2026-05-31T09:22:01Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
2421449
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
ds2ev3dw8t6jov0i0t75yawt9o5qhae
Analytics/Systems/EventLogging/Sensitive Fields
0
455031
2421450
2266522
2026-05-31T09:22:02Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
2421450
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
k9nkblptw4pz7l3cvunzjff64jjqh0w
Analytics/Systems/EventLogging/TestingOnBetaCluster
0
455032
2421451
2266523
2026-05-31T09:22:03Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421451
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Analytics/Systems/EventLogging/TestingOnBetaLabs
0
455033
2421452
2266524
2026-05-31T09:22:05Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421452
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Analytics/Systems/EventLogging/User agent sanitization
0
455034
2421453
2266525
2026-05-31T09:22:06Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
2421453
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
lq89feu0hym2433yawxpz27l176t1h0
Talk:Analytics/Systems/EventLogging
1
455076
2421506
2266568
2026-05-31T09:23:10Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421506
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Talk:Analytics/Systems/EventLogging/Administration
1
455077
2421507
2266569
2026-05-31T09:23:11Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
2421507
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
d8aoaqa0dou1rrttrh352hsovjjill0
Talk:Analytics/Systems/EventLogging/Sanitization vs Aggregation
1
455080
2421508
2266570
2026-05-31T09:23:12Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421508
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
lzpyfc1dpi1gsu0axf5047tshmgp3fh
Data Platform/Systems/EventLogging
0
456383
2421473
2259923
2026-05-31T09:22:30Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging]]
2421473
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Data Platform/Systems/EventLogging/Administration
0
456384
2421474
2259925
2026-05-31T09:22:31Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421474
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Data Platform/Systems/EventLogging/Architecture
0
456385
2421475
2259927
2026-05-31T09:22:32Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
2421475
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
563zn02ep8hvymxagfoiuv51v07t6wn
Data Platform/Systems/EventLogging/Backfilling
0
456386
2421476
2259929
2026-05-31T09:22:33Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
2421476
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
ixtg113mywcmt0chk7x1xzr8d83eidp
Data Platform/Systems/EventLogging/Data representations
0
456387
2421477
2259931
2026-05-31T09:22:35Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
2421477
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
5e27mterkyfcdw6nxfeaiwk2jadfz6n
Data Platform/Systems/EventLogging/EventCapsule
0
456392
2421478
2259941
2026-05-31T09:22:36Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
2421478
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
ekk4tm809d58l8akk4jfh8oiu6c67qj
Data Platform/Systems/EventLogging/How to
0
456393
2421479
2266552
2026-05-31T09:22:37Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421479
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration#How Tos]]
jtj92pzy0a76sqa6yjslz4k2bvfmnhj
Data Platform/Systems/EventLogging/Monitoring
0
456394
2421480
2259945
2026-05-31T09:22:38Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
2421480
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
car9duvgoadqdlvtghdnzp1qalu0ntt
Data Platform/Systems/EventLogging/NotErrorLogging
0
456396
2421481
2259949
2026-05-31T09:22:39Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
2421481
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
ei45szt89ywomi0djg2wti5uaglaiec
Data Platform/Systems/EventLogging/Oncall
0
456397
2421482
2266554
2026-05-31T09:22:41Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421482
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Data Platform/Systems/EventLogging/Outages
0
456398
2421483
2259953
2026-05-31T09:22:42Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Outages]]
2421483
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Data Platform/Systems/EventLogging/Performance
0
456399
2421484
2259955
2026-05-31T09:22:43Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Performance]]
2421484
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Performance]]
qjrz2uod4uyospuds419rd1uxhkdq3p
Data Platform/Systems/EventLogging/Publishing
0
456400
2421485
2259957
2026-05-31T09:22:44Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
2421485
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
nmbj4rbsh37le6a3rekioqs0hzpkwp1
Data Platform/Systems/EventLogging/Sanitization vs Aggregation
0
456401
2421486
2259959
2026-05-31T09:22:45Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421486
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
ozb5iyz2xjb7w6y5xu3x21aukwi414q
Data Platform/Systems/EventLogging/Schema Guidelines
0
456402
2421487
2259961
2026-05-31T09:22:47Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
2421487
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
ds2ev3dw8t6jov0i0t75yawt9o5qhae
Data Platform/Systems/EventLogging/Sensitive Fields
0
456403
2421488
2259963
2026-05-31T09:22:48Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
2421488
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
k9nkblptw4pz7l3cvunzjff64jjqh0w
Data Platform/Systems/EventLogging/TestingOnBetaCluster
0
456404
2421489
2259965
2026-05-31T09:22:49Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421489
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Data Platform/Systems/EventLogging/TestingOnBetaLabs
0
456405
2421490
2266555
2026-05-31T09:22:50Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421490
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Data Platform/Systems/EventLogging/User agent sanitization
0
456406
2421491
2259969
2026-05-31T09:22:51Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
2421491
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
lq89feu0hym2433yawxpz27l176t1h0
Talk:Data Platform/Systems/EventLogging
1
456407
2421512
2259971
2026-05-31T09:23:17Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging]]
2421512
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Talk:Data Platform/Systems/EventLogging/Administration
1
456408
2421513
2259973
2026-05-31T09:23:18Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
2421513
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
d8aoaqa0dou1rrttrh352hsovjjill0
Talk:Data Platform/Systems/EventLogging/Sanitization vs Aggregation
1
456411
2421514
2259979
2026-05-31T09:23:19Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
2421514
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
lzpyfc1dpi1gsu0axf5047tshmgp3fh
User:Ottomata/Event Platform Producing Events
2
460238
2421341
2026-05-30T13:05:48Z
Ottomata
88
initial draft
2421341
wikitext
text/x-wiki
= Event Platform - Producer Tutorial =
This page is a basic overview of how to produce event streams using WMF's Event Platform.
== Overview ==
To produce an event, the following must be satisfied:
# An event JSONSchema in one of the [[labsconsole:Event_Platform/Schemas/Guidelines#WMF_Schema_Repositories|WMF event schema repositories]].
# A stream declared in [[labsconsole:Event_Platform/Stream_Configuration|Event Stream Configuration]].
# Valid events produced via one of:
#* EventLogging JS API (browser client side)
#* EventBus MediaWiki PHP API (server side)
#* Directly to an [[labsconsole:Event_Platform/EventGate|EventGate]]
#* Directly to Kafka (if you know what you are doing :) )
== Event Schemas ==
Event Platform uses JSONSchema for event schemas.
Event schemas are stored in one of the [[Event_Platform/Schemas/Guidelines#WMF_Schema_Repositories|WMF event schema repositories]]. Event schemas are served either from https://schema.wikimedia.org, or from git checkouts of a schema repository. Event schemas are identified by their relative directory path in these repositories and are semantically versioned, e.g. /my_namespace/thing_happened/1.0.0. An event schema's `$id` field must match the schema's relative path. Event JSONSchemas are stored in YAML format.
When an event is produced, its `$schema` field must match the `$id` field of its event schema.
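The relationship between a schema path, its title and its version can be sketched in a few lines of JavaScript. This is only an illustration: `parseSchemaUri` is a hypothetical helper, not part of any Event Platform library, and it assumes the path always ends in a three-part semantic version as in the example above.

<syntaxhighlight lang="javascript">
// Hypothetical helper: split a schema URI such as
// '/my_namespace/thing_happened/1.0.0' into its title and version.
function parseSchemaUri( schemaUri ) {
    const match = /^\/(.+)\/(\d+\.\d+\.\d+)$/.exec( schemaUri );
    if ( !match ) {
        throw new Error( 'Unexpected schema URI: ' + schemaUri );
    }
    return { title: match[ 1 ], version: match[ 2 ] };
}

const parsed = parseSchemaUri( '/my_namespace/thing_happened/1.0.0' );
console.log( parsed.title );   // my_namespace/thing_happened
console.log( parsed.version ); // 1.0.0
</syntaxhighlight>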
A minimal event schema might look like:
<syntaxhighlight lang="json">
TODO
</syntaxhighlight>
An event schema must minimally have these [[Event_Platform/Schemas/Guidelines#Required_fields|required fields]]:
* `$schema`
* `meta.stream`
* `dt`
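As a rough local sanity check, the required fields listed above can be tested on an event object before it is sent. The function name `missingRequiredFields` is made up for this sketch, and the check is not a substitute for full JSONSchema validation against your event schema.

<syntaxhighlight lang="javascript">
// Hypothetical helper: return the names of required fields
// that are missing (or not strings) on an event object.
function missingRequiredFields( event ) {
    const missing = [];
    if ( typeof event.$schema !== 'string' ) {
        missing.push( '$schema' );
    }
    if ( !event.meta || typeof event.meta.stream !== 'string' ) {
        missing.push( 'meta.stream' );
    }
    if ( typeof event.dt !== 'string' ) {
        missing.push( 'dt' );
    }
    return missing;
}

// An event without a timestamp is reported as missing 'dt'.
console.log( missingRequiredFields( {
    $schema: '/my_namespace/thing_happened/1.0.0',
    meta: { stream: 'my_namespace.thing_happened' }
} ) );
</syntaxhighlight>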
== EventStreamConfig ==
[[Event_Platform/Stream_Configuration|Event stream configuration]] is stored in MediaWiki config and served via a MediaWiki Action API endpoint.
WMF production stream configs are declared in the `wgEventStreams` MediaWiki config, in [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/mediawiki-config/%2B/refs/heads/master/wmf-config/ext-EventStreamConfig.php ext-EventStreamConfig.php].
A minimal EventStreamConfig entry should look something like:
<syntaxhighlight lang="php">
'wgEventStreams' => [
    // ...
    'my_namespace.thing_happened' => [
        'schema_title' => 'my_namespace/thing_happened',
        'destination_event_service' => 'eventgate-analytics-external',
    ],
],
</syntaxhighlight>
`schema_title` is how a stream is associated with its event schema.
`destination_event_service` names the EventGate instance the event stream is allowed to be produced to.
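The association between an event, its stream and the stream's `schema_title` could be checked along these lines. Both function names are hypothetical, and the sketch assumes that a schema's title is its `$schema` path without the leading slash and the trailing version, as the example names on this page suggest.

<syntaxhighlight lang="javascript">
// Hypothetical helper: '/my_namespace/thing_happened/1.0.0'
// becomes 'my_namespace/thing_happened'.
function schemaTitleOf( schemaUri ) {
    return schemaUri.replace( /^\//, '' ).replace( /\/[^/]+$/, '' );
}

// True if the event names the given stream and its schema
// title matches the stream config entry's schema_title.
function eventMatchesStreamConfig( event, streamName, streamConfig ) {
    return event.meta.stream === streamName &&
        schemaTitleOf( event.$schema ) === streamConfig.schema_title;
}

const exampleConfig = {
    schema_title: 'my_namespace/thing_happened',
    destination_event_service: 'eventgate-analytics-external'
};
const exampleEvent = {
    $schema: '/my_namespace/thing_happened/1.0.0',
    meta: { stream: 'my_namespace.thing_happened' }
};
console.log( eventMatchesStreamConfig( exampleEvent, 'my_namespace.thing_happened', exampleConfig ) ); // true
</syntaxhighlight>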
If you've made these changes in ext-EventStreamConfig.php, you'll need to schedule a [[Backport windows|Backport window deployment]]. See [[Deployments]] and [[Backport windows]] for instructions.
Additionally, EventStreamConfig sets some overridable defaults. When you use the EventStreamConfig API, these defaults are merged in, so the API response for a stream will have more values set than the minimal entry above.
You can request a specific stream's settings from the EventStreamConfig Action API like this:
<syntaxhighlight lang="bash">
curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&format=json&streams=eventgate-main.test.event' | jq .
</syntaxhighlight>
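If you need the same request from a script, the URL can be assembled programmatically. The sketch below is an illustration only: `streamConfigsUrl` is a made-up name, `format=json` is the standard Action API output parameter, and joining several stream names with `|` assumes the usual Action API multi-value convention.

<syntaxhighlight lang="javascript">
// Hypothetical helper: build a streamconfigs API URL for one or more streams.
function streamConfigsUrl( streamNames, apiBase = 'https://meta.wikimedia.org/w/api.php' ) {
    const url = new URL( apiBase );
    url.searchParams.set( 'action', 'streamconfigs' );
    url.searchParams.set( 'format', 'json' );
    // Multiple values are pipe-separated (assumed Action API convention).
    url.searchParams.set( 'streams', streamNames.join( '|' ) );
    return url.toString();
}

console.log( streamConfigsUrl( [ 'eventgate-main.test.event' ] ) );
// https://meta.wikimedia.org/w/api.php?action=streamconfigs&format=json&streams=eventgate-main.test.event
</syntaxhighlight>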
== Producing events ==
Once you have a schema and a stream declared, you can start producing events.
A minimal event should look something like:
<syntaxhighlight lang="json">
{
  "$schema": "/my_namespace/thing_happened/1.0.0",
  "meta": {
    "stream": "my_namespace.thing_happened"
  },
  "dt": "2026-05-29T12:34:56Z",
  "some_field": "value"
}
</syntaxhighlight>
* The `meta.stream` field indicates the name of the stream dataset. This must match your entry in EventStreamConfig.
* The `$schema` field is the URI path to the event's schema at a specific version.
* The `dt` field represents the event time: the time at which the event happens.
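The three fields described above can be filled in by a small helper so that callers only supply their own data fields. `buildEvent` is a hypothetical name used for illustration, and the sketch assumes the event time is "now" at the moment the helper is called.

<syntaxhighlight lang="javascript">
// Hypothetical helper: wrap custom fields in a minimal event envelope.
function buildEvent( schemaUri, streamName, fields ) {
    return Object.assign( {}, fields, {
        $schema: schemaUri,
        meta: { stream: streamName },
        // ISO 8601 UTC timestamp of the moment the event is built.
        dt: new Date().toISOString()
    } );
}

const event = buildEvent(
    '/my_namespace/thing_happened/1.0.0',
    'my_namespace.thing_happened',
    { some_field: 'value' }
);
console.log( JSON.stringify( event, null, 2 ) );
</syntaxhighlight>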
Events can be produced in any of the following ways.
* Client side in JavaScript, using the MediaWiki EventLogging extension JavaScript API.
* Server side in PHP, using the MediaWiki EventBus extension's PHP API.
* HTTP POSTed to an EventGate endpoint.
* Manually to Kafka. This option is not recommended unless [[labsconsole:Event_Platform/Producer_Requirements|you know what you are doing]].
NOTE: The MediaWiki APIs each ultimately POST events to EventGate.
=== Server side - MediaWiki PHP ===
Example:
<syntaxhighlight lang="php">
use MediaWiki\Deferred\DeferredUpdates;
use MediaWiki\Extension\EventBus\EventBus;

DeferredUpdates::addCallableUpdate( static function () {
    $event = [
        '$schema' => '/my_namespace/thing_happened/1.0.0',
        'meta' => [
            'stream' => 'my_namespace.thing_happened',
        ],
        'dt' => wfTimestamp( TS_ISO_8601 ),
        'page_id' => 12345,
        'action' => 'edited',
    ];
    EventBus::getInstanceForStream( 'my_namespace.thing_happened' )
        ->send( [ $event ] );
} );
</syntaxhighlight>
=== Client side - MediaWiki JavaScript ===
When producing events from MediaWiki browser clients, use the EventLogging JS API.
{{Note|Analytics instrumentation within MediaWiki without pre-existing Event Platform instrumentation should use the [[Test Kitchen]] to create new instrumentation.}}
You'll need to configure the EventLogging extension to be able to produce your stream. Edit ext-EventLogging.php in mediawiki-config.
<syntaxhighlight lang="php">
'wgEventLoggingStreamNames' => [
    'default' => [
        // ...
        'my_namespace.thing_happened',
    ],
],
</syntaxhighlight>
If you've made these changes in ext-EventLogging.php (TODO: link), you'll need to schedule a [[Backport windows|Backport window deployment]]. See [[Deployments]] and [[Backport windows]] for instructions.
Once your config is deployed, you can produce events using MediaWiki JS like:
<syntaxhighlight lang="javascript">
mw.eventLog.submit( 'my_namespace.thing_happened', {
    $schema: '/my_namespace/thing_happened/1.0.0',
    dt: '2026-05-29T12:34:56Z',
    page_id: mw.config.get( 'wgArticleId' ),
    action: 'clicked',
    target: 'example-widget'
} );
</syntaxhighlight>
Note that the stream name is provided as the first parameter. It will be set in your event as `meta.stream`.
=== HTTP POST to EventGate ===
<syntaxhighlight lang="bash">
echo '
{
  "$schema": "/my_namespace/thing_happened/1.0.0",
  "meta": {
    "stream": "my_namespace.thing_happened"
  },
  "dt": "2026-05-29T12:34:56Z",
  "some_field": "value"
}
' | curl -H 'Content-Type: application/json' -d @- https://intake-analytics.wikimedia.org/v1/events
</syntaxhighlight>
NOTE: The actual EventGate endpoint to use will vary depending on your stream's purpose and where it is produced from.
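The same POST can be prepared from JavaScript. The sketch below only builds the request and does not send it. `eventGateRequest` is a hypothetical helper, the default endpoint is the one from the curl example above, and the JSON `Content-Type` header is an assumption about what your EventGate instance expects.

<syntaxhighlight lang="javascript">
// Hypothetical helper: build the URL and fetch() options for
// POSTing a single event to an EventGate endpoint.
function eventGateRequest( event, endpoint = 'https://intake-analytics.wikimedia.org/v1/events' ) {
    return {
        url: endpoint,
        options: {
            method: 'POST',
            // Assumption: the endpoint expects a JSON request body.
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify( event )
        }
    };
}

const request = eventGateRequest( {
    $schema: '/my_namespace/thing_happened/1.0.0',
    meta: { stream: 'my_namespace.thing_happened' },
    dt: '2026-05-29T12:34:56Z',
    some_field: 'value'
} );
console.log( request.options.method, request.url );
// To actually send it: fetch( request.url, request.options );
</syntaxhighlight>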
== Viewing produced events ==
Using events for production features or analytics is a complex topic and won't be covered here. However, you can easily view or query your events directly using Data Platform tools.
=== EventStreams service ===
Live events can be viewed using the EventStreams UI at https://stream-internal.wikimedia.org.
If your event is published publicly (see below), it will also be available in the public EventStreams service at https://stream.wikimedia.org.
=== Data Lake Hive tables ===
By default, all event streams are ingested into the Data Lake (TODO link) in Hive (TODO link) tables. The Hive tables are named corresponding to a normalized version of your stream name, e.g. `event.my_namespace_thing_happened`.
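To guess the table name for a stream, a normalization like the following reproduces the example above. The exact normalization rules are an assumption here (lowercasing and replacing every character other than letters, digits and underscores with `_`), and `hiveTableName` is a made-up helper name, so check the actual table list if the result looks wrong.

<syntaxhighlight lang="javascript">
// Hypothetical helper: derive a Hive table name from a stream name.
// Assumed rule: lowercase, then replace anything that is not
// a-z, 0-9 or '_' with '_'.
function hiveTableName( streamName, database = 'event' ) {
    const table = streamName.toLowerCase().replace( /[^a-z0-9_]/g, '_' );
    return database + '.' + table;
}

console.log( hiveTableName( 'my_namespace.thing_happened' ) );
// event.my_namespace_thing_happened
</syntaxhighlight>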
An event should be ingested into its corresponding Hive table within a few hours. Once there, events can be queried with various tools, including Spark SQL or Presto SQL, via Jupyter Notebooks, Superset SQLLab, etc.
To learn more about how to access data in the Data Lake, see TODO LINK HERE.
=== Kafka ===
Events are produced to Kafka topics. You can consume events directly from Kafka using kafkacat (AKA kcat) or any other Kafka consumer client.
You'll need to know which Kafka topic and which Kafka cluster to consume from.
The Kafka topics that underlie your event stream can be fetched from the EventStreamConfig Action API:
<syntaxhighlight lang="bash">
curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&format=json&streams=<your_stream_name_here>' | jq .
# ...
      "schema_title": "my_namespace/thing_happened",
      "destination_event_service": "eventgate-analytics-external",
      "stream": "my_namespace.thing_happened",
      "topics": [
        "eqiad.my_namespace.thing_happened",
        "codfw.my_namespace.thing_happened"
      ]
    }
  }
}
</syntaxhighlight>
WMF operates several [[Kafka#Kafka_Clusters|Kafka clusters]]. Your event stream may be on several of these Kafka clusters, but most event streams are mirrored to the Kafka jumbo-eqiad cluster and are available there.
Choose at least one of the broker hostnames in the target Kafka cluster.
==== kafkacat from a stat box (TODO: link) ====
<syntaxhighlight lang="bash">
# Consume:
# - -u: unbuffered
# - -b: from Kafka jumbo-eqiad cluster
# - -t: from topic eqiad.<your_stream_name_here>
# - -o end: from the end of the topic
# (and pipe into jq for JSON formatting.)
kafkacat -C -u -b kafka-jumbo1010.eqiad.wmnet:9092 -t eqiad.<your_stream_name_here> -o end | jq .
</syntaxhighlight>
== Exposing events ==
Event streams that are produced to the Kafka main clusters (TODO: link) can be exposed for public consumption via the EventStreams HTTP API at https://stream.wikimedia.org. If you are sure your event stream contains no PII and does not have a privacy or security risk (TODO: link to security review), you can edit its EventStream configuration to expose it.
Edit the <code>allowed_streams</code> setting in [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/eventstreams/values.yaml deployment-charts helmfile.d/services/eventstreams/values.yaml]:
<syntaxhighlight lang="yaml">
config:
  # ...
  allowed_streams:
    # ...
    - <your_stream_name_here>
</syntaxhighlight>
Get the patch reviewed and deployed, and your stream will be consumable publicly. It will also show up in the public [https://stream-internal.wikimedia.org/?doc#/streams EventStreams API docs].
== Local MediaWiki development with EventGate and docker compose ==
For local development, you'll need a local EventGate. If you use MediaWiki Docker Compose, you can use the [[mw:MediaWiki-Docker/Configuration_recipes/EventGate|EventGate Compose recipe]].
The EventGate devserver listens on port 8192 and validates events using its configured schema repository.
Basic EventGate development configuration does not use EventStreamConfig and will allow all event streams to be produced.
You can test that EventGate is working with a simple HTTP request:
<syntaxhighlight lang="bash">
curl http://localhost:8192/v1/_test/events
</syntaxhighlight>
EventGate devserver is sometimes configured to write events to a local file (TODO: location).
=== Configuring MediaWiki to use EventGate devserver ===
Clone the necessary extensions into your extensions/ directory.
* [[mw:Extension:EventBus|EventBus]] - required for MediaWiki server side
* [[mw:Extension:EventLogging|EventLogging]] - required for MediaWiki client side and for MediaWiki server side EventLogging use.
* [[mw:Extension:EventStreamConfig|EventStreamConfig]] - optional. Only if you want to use EventStreamConfig locally. This will require custom configuration.
Add the following to your LocalSettings.php:
<syntaxhighlight lang="php">
wfLoadExtensions( [
// Uncomment whichever extensions you are using
// 'EventBus',
// 'EventStreamConfig',
// 'EventLogging',
] );
// If using EventBus:
if ( !defined( 'MW_PHPUNIT_TEST' ) ) {
$wgEventServices = [
'default' => [
'url' => 'http://eventgate-dev:8192/v1/events',
],
];
$wgEventServiceDefault = 'default';
$wgEnableEventBus = 'TYPE_EVENT';
}
</syntaxhighlight>
=== Schema development ===
To develop event schemas, you'll need:
* a local checkout of a schema repository,
* EventGate configured to use the local checkout.
Edit the docker-compose eventgate service to point to a local custom eventgate config file (see https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/blob/master/config.dev.yaml). In that config file, set <code>schema_base_uris</code> to a local path to your schema repository checkout, then restart eventgate.
== Learn more ==
* [[Event_Platform|Event Platform]]
* [[Event_Platform/Schemas|Event Platform/Schemas]]
* [[Event_Platform/Schemas/Guidelines|Event Platform/Schemas/Guidelines]]
* [[Event_Platform/Stream_Configuration|Event Platform/Stream Configuration]]
* [[Event_Platform/Producer_Requirements|Event Platform/Producer Requirements]]
* [[mw:MediaWiki-Docker/Configuration_recipes/EventGate|MediaWiki-Docker: EventGate recipe]]
* [[Hadoop_Event_Ingestion_Lifecycle|Data Lake Event Ingestion]]
* [[Event_Platform/EventLogging_legacy|Event Platform vs EventLogging legacy]]
rbu5p94ugrty7y9uw82iemzqyjolc8h
2421342
2421341
2026-05-30T13:46:52Z
Ottomata
88
draft 2
2421342
wikitext
text/x-wiki
= Event Platform - Producer Tutorial =
This page is a basic overview of how to produce events to event streams using WMF's Event Platform.
== Overview ==
To produce an event, the following must be satisfied:
# An event JSONSchema in one of the [[labsconsole:Event_Platform/Schemas/Guidelines#WMF_Schema_Repositories|WMF event schema repositories]].
# A stream declared in [[labsconsole:Event_Platform/Stream_Configuration|Event Stream Configuration]].
# Valid events produced via one of:
#* EventLogging JS API (browser client side)
#* EventBus MediaWiki PHP API (server side)
#* Directly to an [[labsconsole:Event_Platform/EventGate|EventGate]]
#* Directly to Kafka (if you know what you are doing :) )
== Event Schemas ==
Event Platform uses JSONSchema for event schemas.
WMF Event schemas
* are stored in one of the [[Event_Platform/Schemas/Guidelines#WMF_Schema_Repositories|WMF event schema repositories]].
* are stored in YAML format.
* are served either from https://schema.wikimedia.org, or from git checkouts of a schema repository.
* are semantically versioned, e.g. <code>/my_namespace/thing_happened/1.0.0.yaml</code>
* are located using a schema URI that matches their relative directory path in these repositories, e.g. <code>/my_namespace/thing_happened/1.0.0.yaml</code>
* have an <code>$id</code> field that matches the schema's relative directory path, e.g. <code>/my_namespace/thing_happened/1.0.0</code>
* have a <code>title</code> field that matches the schema's relative directory path without the leading '/' and without the version, e.g. <code>my_namespace/thing_happened</code>
A minimal event schema might look like:
<syntaxhighlight lang="yaml">
title: my_namespace/thing_happened
description: An event about when a thing happened
$id: /my_namespace/thing_happened/1.0.0
$schema: 'https://json-schema.org/draft-07/schema#'
type: object
additionalProperties: false
required:
  - $schema
  - meta
  - dt
properties:
  $schema:
    description: >
      A URI identifying the JSONSchema for this event. This should match a
      schema's $id in a schema repository. E.g. /schema/title/1.0.0
    type: string
  dt:
    description: >
      ISO-8601 formatted timestamp (in UTC) of when the event occurred/was
      generated, AKA 'event time'. This is different than meta.dt, which is
      used as the time the system received this event.
    type: string
    format: date-time
    maxLength: 128
  meta:
    type: object
    required:
      - stream
    properties:
      stream:
        description: Name of the stream (dataset) that this event belongs in
        type: string
        minLength: 1
  some_field:
    description: the value of the thing that happened
    type: string
</syntaxhighlight>
An event schema must minimally have these [[Event_Platform/Schemas/Guidelines#Required_fields|required fields]]:
* <code>$schema</code> - set to the <code>$id</code> value of the event's schema.
* <code>meta.stream</code> - the stream's dataset name
* <code>dt</code> - event time: a UTC date-time string in ISO-8601 format
{{Note|text=This is a minimal example of a 'materialized' event JSONSchema. WMF event schema repositories use [https://gitlab.wikimedia.org/repos/data-engineering/jsonschema-tools jsonschema-tools] for DRYer schemas and to manage the schema lifecycle. You can learn more at [[Event_Platform/Schemas]] and [[Event_Platform/Schemas/Guidelines]], as well as in the READMEs of the WMF schema repositories.}}
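Before sending an event, it can help to check locally that the three required fields are present. The sketch below is one way to do that in JavaScript. The helper name <code>missingRequiredFields</code> is hypothetical and not part of any Event Platform library, and the <code>dt</code> check is looser than strict ISO-8601 validation because it accepts anything <code>Date.parse</code> understands. EventGate still performs the real validation against the full schema.

<syntaxhighlight lang="javascript">
// Hypothetical helper: list the required fields missing from an event.
// It only checks the three fields named above; real validation is done by
// EventGate against the full JSONSchema.
function missingRequiredFields( event ) {
	const missing = [];
	if ( typeof event.$schema !== 'string' || event.$schema === '' ) {
		missing.push( '$schema' );
	}
	if ( !event.meta || typeof event.meta.stream !== 'string' || event.meta.stream === '' ) {
		missing.push( 'meta.stream' );
	}
	// dt must be a string that parses as a date; Date.parse returns NaN otherwise.
	if ( typeof event.dt !== 'string' || Number.isNaN( Date.parse( event.dt ) ) ) {
		missing.push( 'dt' );
	}
	return missing;
}

const goodEvent = {
	$schema: '/my_namespace/thing_happened/1.0.0',
	meta: { stream: 'my_namespace.thing_happened' },
	dt: '2026-05-29T12:34:56Z',
	some_field: 'value'
};

console.log( missingRequiredFields( goodEvent ) );
// prints []
console.log( missingRequiredFields( { some_field: 'value' } ) );
// prints [ '$schema', 'meta.stream', 'dt' ]
</syntaxhighlight>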
== EventStreamConfig ==
[[Event Platform/Stream Configuration|Event stream configuration]] is stored in MediaWiki config and served via a MediaWiki Action API endpoint.
WMF production stream configs are declared in the <code>wgEventStreams</code> mediawiki config, in [[gerrit:plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/ext-EventStreamConfig.php|ext-EventStreamConfig.php]].
A minimal stream config entry should look something like:
<syntaxhighlight lang="php">
'wgEventStreams' => [
	'default' => [
		// ...
		'my_namespace.thing_happened' => [
			'schema_title' => 'my_namespace/thing_happened',
			'destination_event_service' => 'eventgate-analytics-external',
		],
	],
],
</syntaxhighlight>
<code>schema_title</code> associates an event stream with its event schema. This must match exactly the <code>title</code> field of the schema.
<code>destination_event_service</code> names the EventGate instance the event stream is allowed to be produced to.
If you've made changes to ext-EventStreamConfig.php, you'll need to schedule a [[Backport windows|Backport window deployment]]. See [[Deployments]] and [[Backport windows]] for instructions.
Additionally, <code>wgEventStreamDefaults</code> provides common, overridable defaults for all streams. The EventStreamConfig API merges these defaults in, so the API response for a stream will contain more settings than the minimal entry above.
You can request a specific stream's settings from the EventStreamConfig Action API like this:
<syntaxhighlight lang="bash">curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&streams=eventgate-main.test.event' | jq .</syntaxhighlight>
[[mw:Extension:EventStreamConfig|EventStreamConfig]] is implemented as a MediaWiki extension.
== Producing events ==
Once you have a schema and a stream declared, you can start producing events.
A minimal event should look something like:
<syntaxhighlight lang="json">
{
"$schema": "/my_namespace/thing_happened/1.0.0",
"meta": {
"stream": "my_namespace.thing_happened"
},
"dt": "2026-05-29T12:34:56Z",
"some_field": "value"
}
</syntaxhighlight>
* The <code>meta.stream</code> field indicates the name of the stream dataset. This must match your stream name entry in <code>wgEventStreams</code>.
* The <code>$schema</code> field is the URI path to the event's schema at a specific version.
* The <code>dt</code> field represents the event time: the time at which the event happens.
Events can be produced in any of the following ways.
* Client side in JavaScript, using the MediaWiki EventLogging extension JavaScript API.
* Server side in PHP, using the MediaWiki EventBus extension's PHP API.
* HTTP POSTed to an EventGate endpoint.
* Manually to Kafka. This option is not recommended unless [[labsconsole:Event_Platform/Producer_Requirements|you know what you are doing]].
{{Note|text=These MediaWiki APIs POST events to EventGate.}}
=== Server side - MediaWiki PHP ===
Example:
<syntaxhighlight lang="php">
use MediaWiki\Deferred\DeferredUpdates;
use MediaWiki\Extension\EventBus\EventBus;

// Defer sending so the event is produced after the main request work is done.
DeferredUpdates::addCallableUpdate( static function () {
	$event = [
		'$schema' => '/my_namespace/thing_happened/1.0.0',
		'meta' => [
			'stream' => 'my_namespace.thing_happened',
		],
		'dt' => wfTimestamp( TS_ISO_8601 ),
		// Only fields declared in the schema's properties are allowed.
		'some_field' => 'value',
	];
	EventBus::getInstanceForStream( 'my_namespace.thing_happened' )
		->send( [ $event ] );
} );
</syntaxhighlight>
=== Client side - MediaWiki JavaScript ===
When producing events from MediaWiki browser clients, use the EventLogging JS API.
{{Note|New analytics instrumentation in MediaWiki that does not build on existing Event Platform instrumentation should be created with [[Test Kitchen]].}}
You'll need to configure the EventLogging extension to be able to produce your stream. Add your stream to <code>wgEventLoggingStreamNames</code> in [[gerrit:plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/ext-EventLogging.php|ext-EventLogging.php]] in mediawiki-config.
<syntaxhighlight lang="php">
'wgEventLoggingStreamNames' => [
'default' => [
// ...
'my_namespace.thing_happened',
],
],
</syntaxhighlight>
If you've made these changes in ext-EventLogging.php, you'll need to schedule a [[Backport windows|Backport window deployment]]. See [[Deployments]] and [[Backport windows]] for instructions.
Once your config is deployed, you can produce events using MediaWiki JS like:
<syntaxhighlight lang="javascript">
mw.eventLog.submit( 'my_namespace.thing_happened', {
	$schema: '/my_namespace/thing_happened/1.0.0',
	// Event time: when the event occurred on the client.
	dt: new Date().toISOString(),
	// Only fields declared in the schema's properties are allowed.
	some_field: 'value'
} );
</syntaxhighlight>
Note that the stream name is provided as the first parameter. It will be set in your event as <code>meta.stream</code>.
=== HTTP POST to EventGate ===
<syntaxhighlight lang="bash">
echo '
{
"$schema": "/my_namespace/thing_happened/1.0.0",
"meta": {
"stream": "my_namespace.thing_happened"
},
"dt": "2026-05-29T12:34:56Z",
"some_field": "value"
}
' | curl -H 'Content-Type: application/json' -d @- https://intake-analytics.wikimedia.org/v1/events
</syntaxhighlight>{{Note|text=The actual EventGate endpoint to use will vary depending on your stream's purpose and where it is produced from.}}
== Viewing produced events ==
Using events for production features or analytics is a complex topic and won't be covered here. However, you can view or query your events directly using [[Data Platform]] tools.
=== [[Event Platform/EventStreams|EventStreams]] service ===
Live events can be viewed using the EventStreams UI at https://stream-internal.wikimedia.org.
If your event stream is [[#Exposing events|exposed publicly]] (see below), it will also be available in the public EventStreams service at https://stream.wikimedia.org.
=== Data Lake Hive tables ===
By default, all event streams are ingested into [[Data Platform/Data Lake|Data Lake]] [[Data Platform/Data Lake/SQL|SQL tables]]. The tables are named corresponding to a normalized version of your stream name, e.g. <code>event.my_namespace_thing_happened</code>.
An event should be ingested into its corresponding Data Lake table within a few hours. Once there, events can be queried with various tools, including Spark SQL or Presto SQL, via Jupyter Notebooks, Superset SQLLab, etc.
=== [[Kafka]] ===
Events are produced to Kafka topics. You can consume events directly from Kafka using <code>kafkacat</code> (AKA <code>kcat</code>) or any other Kafka consumer client.
You'll need to know which Kafka topic and which Kafka cluster to consume from.
The Kafka topics that underlie your event stream can be fetched from the EventStreamConfig Action API:
<syntaxhighlight lang="bash">
curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&streams=my_namespace.thing_happened' | jq .
# ...
      "schema_title": "my_namespace/thing_happened",
      "destination_event_service": "eventgate-analytics-external",
      "stream": "my_namespace.thing_happened",
      "topics": [
        "eqiad.my_namespace.thing_happened",
        "codfw.my_namespace.thing_happened"
      ]
    }
  }
}
</syntaxhighlight>
WMF operates several [[Kafka#Kafka_Clusters|Kafka clusters]]. Your event stream may be on several of these Kafka clusters, but most event streams are mirrored to the Kafka jumbo-eqiad cluster and are available there.
Choose at least one of the broker hostnames in the target Kafka cluster.
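As a rough illustration of how topic names relate to a stream name, the JavaScript sketch below builds the datacenter-prefixed topic names for a stream. The helper name <code>guessTopicsForStream</code> is hypothetical, and the <code>&lt;datacenter&gt;.&lt;stream name&gt;</code> rule is an assumption taken from the examples on this page. The <code>topics</code> setting returned by the EventStreamConfig API remains the authoritative list.

<syntaxhighlight lang="javascript">
// Hypothetical helper: guess the Kafka topics behind a stream.
// Assumption: each topic is "<datacenter>.<stream name>", as in the examples
// on this page. The "topics" setting in the API response is authoritative.
function guessTopicsForStream( streamName, datacenters = [ 'eqiad', 'codfw' ] ) {
	return datacenters.map( ( dc ) => `${ dc }.${ streamName }` );
}

// Print one topic per line.
guessTopicsForStream( 'my_namespace.thing_happened' ).forEach( ( topic ) => {
	console.log( topic );
} );
</syntaxhighlight>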
==== kafkacat from a [[Data Platform/Systems/Stat hosts|stat box]] ====
<syntaxhighlight lang="bash">
# -C: consume
# -u: unbuffered
# -b: from Kafka jumbo-eqiad cluster
# -t: topic eqiad.my_namespace.thing_happened
# -o end: starting at the end of the topic
# (and pipe into jq for JSON formatting.)
kafkacat -C -u -b kafka-jumbo1010.eqiad.wmnet:9092 -t eqiad.my_namespace.thing_happened -o end | jq .
</syntaxhighlight>
== Exposing events ==
Event streams that are produced to the Kafka main clusters can be exposed for public consumption via the EventStreams HTTP API at https://stream.wikimedia.org.
If you are sure your event stream contains no PII and does not have a privacy or security risk (TODO: link), you can expose it.
Edit the <code>allowed_streams</code> setting in [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/eventstreams/values.yaml|deployment-charts helmfile.d/services/eventstreams/values.yaml]]:
<syntaxhighlight lang="yaml">
config:
  # ...
  allowed_streams:
    # ...
    - my_namespace.thing_happened
</syntaxhighlight>
Get the patch reviewed and deployed, and your stream will be consumable publicly. It will also show up in the public [https://stream-internal.wikimedia.org/?doc#/streams EventStreams API docs].
== Local MediaWiki development with EventGate and docker compose ==
For local development, you'll need a local EventGate. If you use MediaWiki Docker Compose, you can use the [[mw:MediaWiki-Docker/Configuration_recipes/EventGate|EventGate Compose recipe]].
The EventGate devserver listens on port 8192 and validates events using its configured schema repository.
Basic EventGate development configuration does not use EventStreamConfig and will allow all event streams to be produced.
You can test that EventGate is working with a simple HTTP request:
<syntaxhighlight lang="bash">
curl http://localhost:8192/v1/_test/events
</syntaxhighlight>
EventGate devserver is sometimes configured to write events to a local file (TODO: location).
=== Configuring MediaWiki to use EventGate devserver ===
Clone the necessary extensions into your extensions/ directory.
* [[mw:Extension:EventBus|EventBus]] - required for MediaWiki server side
* [[mw:Extension:EventLogging|EventLogging]] - required for MediaWiki client side and for server side EventLogging analytics events.
* [[mw:Extension:EventStreamConfig|EventStreamConfig]] - optional. Only if you want to use EventStreamConfig locally. This will require custom configuration.
Add the following to your LocalSettings.php:
<syntaxhighlight lang="php">
wfLoadExtensions( [
// Uncomment whichever extensions you are using
// 'EventBus',
// 'EventStreamConfig',
// 'EventLogging',
] );
// If using EventBus:
if ( !defined( 'MW_PHPUNIT_TEST' ) ) {
$wgEventServices = [
'default' => [
'url' => 'http://eventgate-dev:8192/v1/events',
],
];
$wgEventServiceDefault = 'default';
$wgEnableEventBus = 'TYPE_EVENT';
}
</syntaxhighlight>
=== Schema development ===
To develop event schemas, you'll need:
* a local checkout of a schema repository,
* EventGate configured to use the local checkout.
Edit the docker-compose eventgate service to point to a local custom eventgate config file (see https://gitlab.wikimedia.org/repos/data-engineering/eventgate-wikimedia/-/blob/master/config.dev.yaml). Edit the config file and set the <code>schema_base_uris</code> to a local path to your schema repository checkout and restart eventgate.
== Learn more ==
* [[Event_Platform|Event Platform]]
* [[Event_Platform/Schemas|Event Platform/Schemas]]
* [[Event_Platform/Schemas/Guidelines|Event Platform/Schemas/Guidelines]]
* [[Event_Platform/Stream_Configuration|Event Platform/Stream Configuration]]
* [[Event_Platform/Producer_Requirements|Event Platform/Producer Requirements]]
* [[mw:MediaWiki-Docker/Configuration_recipes/EventGate|MediaWiki-Docker: EventGate recipe]]
* [[Data Lake/Hadoop Event Ingestion Lifecycle|Data Lake Event Ingestion]]
* [[Event_Platform/EventLogging_legacy|Event Platform vs EventLogging legacy]]
ofk6besa32sfg28848wn50s4gecgj6s
2421521
2421342
2026-05-31T10:14:10Z
Ottomata
88
2421521
wikitext
text/x-wiki
{{Navigation Event Platform}}
This is a guide for producing event streams into WMF's [[Event Platform]]. It walks through the schema and stream config, the available producer clients, local development, deployment, and how to consume the events once they're flowing.
{{Note|For most analytics instrumentation, you should use [[Test Kitchen]].}}
== Overview ==
To produce events to Event Platform, you need:
# An event JSONSchema in one of the [[Event Platform/Schemas/Guidelines#WMF Schema Repositories|WMF schema repositories]].
# A stream declared in [[Event Platform/Stream Configuration|EventStreamConfig]].
# Valid events produced via one of:
#* [[mw:Extension:EventLogging|EventLogging]] (browser JS, or PHP from MediaWiki, for analytics)
#* [[mw:Extension:EventBus|EventBus]] (PHP from MediaWiki, for production features)
#* An HTTP POST to [[Event Platform/EventGate|EventGate]] (any language, any service)
#* Directly to [[Kafka]], if you have a good reason (see [[Event Platform/Producer Requirements]])
Once those are deployed, events flow through EventGate into Kafka, get ingested into the [[Data Platform/Data Lake|Data Lake]] as a Hive table, and may also be available for consumption in the [[Event Platform/EventStreams|EventStreams HTTP API]].
== Event Platform requirements ==
=== Event schemas ===
WMF event schemas are written as YAML JSONSchemas, kept in Git, and identified by a versioned URI. They live in one of the [[Event Platform/Schemas/Guidelines#WMF Schema Repositories|WMF event schema repositories]] (e.g. [[gitlab:repos/data-engineering/schemas-event-primary|schemas-event-primary]] for operational production schemas, [[gitlab:repos/data-engineering/schemas-event-secondary|schemas-event-secondary]] for analytics).
A schema's <code>title</code> matches its path in the repository, and its <code>$id</code> is the path with a semver version on the end. For example, <code>jsonschema/my_namespace/thing_happened/1.0.0.yaml</code> might have:
<syntaxhighlight lang="yaml">
title: my_namespace/thing_happened
$id: /my_namespace/thing_happened/1.0.0
</syntaxhighlight>
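The relationship between a materialized schema file's path, its <code>title</code> and its <code>$id</code> can be sketched in JavaScript. The helper name <code>schemaIdentifiersFromPath</code> is hypothetical, and the path layout <code>jsonschema/&lt;title&gt;/&lt;semver&gt;.yaml</code> is an assumption based on the example above. Individual schema repositories may be laid out differently, so check the repository README.

<syntaxhighlight lang="javascript">
// Hypothetical helper: derive title and $id from a materialized schema path.
// Assumption: paths look like "jsonschema/<title>/<semver>.yaml".
// baseDir is used verbatim in the pattern, so it must not contain regex
// special characters.
function schemaIdentifiersFromPath( filePath, baseDir = 'jsonschema' ) {
	const pattern = new RegExp( `^${ baseDir }/(.+)/(\\d+\\.\\d+\\.\\d+)\\.yaml$` );
	const match = pattern.exec( filePath );
	if ( !match ) {
		throw new Error( `Not a versioned schema path: ${ filePath }` );
	}
	const [ , title, version ] = match;
	return { title, $id: `/${ title }/${ version }` };
}

const ids = schemaIdentifiersFromPath( 'jsonschema/my_namespace/thing_happened/1.0.0.yaml' );
console.log( ids.title );
// prints "my_namespace/thing_happened"
console.log( ids.$id );
// prints "/my_namespace/thing_happened/1.0.0"
</syntaxhighlight>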
A minimal unmaterialized <code>current.yaml</code> schema file might look like this:
<syntaxhighlight lang="yaml">
title: my_namespace/thing_happened
description: An event about when a thing happened
$id: /my_namespace/thing_happened/1.0.0
$schema: https://json-schema.org/draft-07/schema#
type: object
allOf:
  - $ref: /fragment/common/2.0.0#
  - properties:
      some_field:
        description: The value of the thing that happened
        type: string
</syntaxhighlight>
<code>$ref</code> pulls in required and common fields ([[#Required event fields|see below]]) so you don't have to define them yourself.
For the full schema authoring workflow (including how <code>jsonschema-tools</code> materializes versioned files), see [[Event Platform/Schemas]] and [[Event Platform/Schemas/Guidelines]]. The README in each schema repository also documents its own conventions. ''Read those before creating or modifying a schema.''
Once a schema change is merged, it should be deployed automatically to https://schema.wikimedia.org within 30 minutes.
==== Required event fields ====
Every event schema must have these fields:
* <code>$schema</code>: the versioned schema URI. Must match the schema's <code>$id</code>. EventGate uses this to look up which schema to validate against.
* <code>meta.stream</code>: the name of the stream this event belongs to.
* <code>dt</code>: the event timestamp, as an ISO-8601 UTC date-time.
* <code>meta.dt</code>: the system receive timestamp, as an ISO-8601 UTC date-time.
{{Note|Generally, you should not set <code>meta.dt</code> values in your events. EventGate will set this for you.}}
See also [[Event Platform/Schemas/Guidelines#Required fields]]. <- TODO FIX LINK
=== Stream configuration ===
Declare your stream by adding an entry to <code>wgEventStreams</code> in [[gerrit:plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/ext-EventStreamConfig.php|mediawiki-config/wmf-config/ext-EventStreamConfig.php]]. A minimal entry is:
<syntaxhighlight lang="php">
'wgEventStreams' => [
'default' => [
// ...
'my_namespace.thing_happened' => [
'schema_title' => 'my_namespace/thing_happened',
'destination_event_service' => 'eventgate-analytics-external',
],
],
],
</syntaxhighlight>
<code>schema_title</code> must match the schema's <code>title</code> field exactly. This is used to ensure that only events of that schema are allowed in the stream.
<code>destination_event_service</code> names the [[Event Platform/EventGate#EventGate clusters|EventGate cluster]] the event stream is allowed to be produced to.
Other common settings are documented at [[Event Platform/Stream Configuration#Common Settings Documentation|common stream settings]].
==== Deploying stream config ====
# Edit <code>ext-EventStreamConfig.php</code> and get the patch reviewed.
# Schedule a [[Backport windows|Backport window]] to deploy mediawiki-config (or deploy it on your own).
# Verify your stream is in the API: <code>curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&streams=my_namespace.thing_happened' | jq .</code>
# If your stream targets an EventGate cluster that only requests stream configs at startup (check the [[Event Platform/EventGate#EventGate clusters|docs]]), [[Event Platform/EventGate/Administration#Roll restart all pods|ask for that cluster to be roll-restarted]].
To override settings for [[mw:Beta Cluster|beta]] only, edit <code>InitialiseSettings-labs.php</code> instead. See [[#Per-wiki and beta overrides]] below.
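The verification step in the list above can also be scripted. The JavaScript sketch below looks a stream up in an already parsed <code>streamconfigs</code> response. The helper name <code>findStreamConfig</code> is hypothetical, the response shape (a top-level <code>streams</code> object keyed by stream name) is an assumption, and the sample response is hand-written for illustration rather than real API output.

<syntaxhighlight lang="javascript">
// Hypothetical helper: look a stream up in a parsed streamconfigs response.
// Assumption: the response looks like
// { streams: { '<stream name>': { ...settings } } }.
function findStreamConfig( apiResponse, streamName ) {
	const streams = ( apiResponse && apiResponse.streams ) || {};
	return Object.prototype.hasOwnProperty.call( streams, streamName ) ?
		streams[ streamName ] :
		null;
}

// Sample response, hand-written for this example (not real API output).
const sampleResponse = {
	streams: {
		'my_namespace.thing_happened': {
			schema_title: 'my_namespace/thing_happened',
			destination_event_service: 'eventgate-analytics-external'
		}
	}
};

const config = findStreamConfig( sampleResponse, 'my_namespace.thing_happened' );
console.log( config ? config.schema_title : 'stream not declared' );
// prints "my_namespace/thing_happened"
</syntaxhighlight>

A <code>null</code> result would suggest the config change has not been deployed yet, or that the stream name is misspelled.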
== Producing events ==
There are several ways to produce events to Kafka:
* [[mw:Extension:EventBus|EventBus]]'s PHP API: for non-analytics events.
* HTTP POST to EventGate <code>/v1/events</code> endpoint.
* [[mw:Extension:EventLogging|EventLogging]]'s PHP API: for analytics events (this uses EventBus).
* MediaWiki EventLogging JS API: POSTs to an externally exposed EventGate.
* Directly to Kafka - not recommended unless using a supported client library
=== Producing with EventBus ===
<syntaxhighlight lang="php">
use MediaWiki\Deferred\DeferredUpdates;
use MediaWiki\Extension\EventBus\EventBus;
DeferredUpdates::addCallableUpdate( static function () {
$event = [
'$schema' => '/my_namespace/thing_happened/1.0.0',
'meta' => [
'stream' => 'my_namespace.thing_happened',
],
'dt' => wfTimestamp( TS_ISO_8601 ),
'page_id' => 12345,
'action' => 'edited',
];
EventBus::getInstanceForStream( 'my_namespace.thing_happened' )
->send( [ $event ] );
} );
</syntaxhighlight>
<code>EventBus::getInstanceForStream</code> uses your stream's <code>destination_event_service</code> setting to choose the EventGate cluster.
=== Producing to EventGate ===
The POST body can be a single event or an array of events.
<syntaxhighlight lang="bash">
echo '
{
"$schema": "/my_namespace/thing_happened/1.0.0",
"meta": { "stream": "my_namespace.thing_happened" },
"dt": "2026-05-29T12:34:56Z",
"some_field": "value"
}
' | curl -H 'Content-Type: application/json' -d @- \
https://intake-analytics.wikimedia.org/v1/events
</syntaxhighlight>
EventGate has two producer modes: 'guaranteed' and 'hasty'. 'guaranteed' is the default. See [[Event Platform/EventGate#Producer types: Guaranteed and Hasty|EventGate producer modes]] for the difference.
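Because the POST body can be an array, several events can be sent in one request. The JavaScript sketch below builds the options for such a request. The helper name <code>buildEventGateRequest</code> is hypothetical, and the code assumes a runtime that provides <code>fetch</code> (a modern browser or a recent Node.js) if you uncomment the last line. Nothing is sent as written.

<syntaxhighlight lang="javascript">
// Hypothetical helper: build fetch() options for POSTing events to EventGate.
// Accepts a single event object or an array of events.
function buildEventGateRequest( events ) {
	return {
		method: 'POST',
		headers: { 'Content-Type': 'application/json' },
		body: JSON.stringify( events )
	};
}

// Two events for the same stream, sent as one batch.
const batch = [ 'first', 'second' ].map( ( value ) => ( {
	$schema: '/my_namespace/thing_happened/1.0.0',
	meta: { stream: 'my_namespace.thing_happened' },
	dt: new Date().toISOString(),
	some_field: value
} ) );

const request = buildEventGateRequest( batch );
console.log( request.body );

// To actually send it (requires network access and a runtime with fetch):
// fetch( 'https://intake-analytics.wikimedia.org/v1/events', request );
</syntaxhighlight>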
=== Producing to Kafka ===
You can produce directly to Kafka, but you should do everything EventGate would do: schema lookup, validation, setting <code>meta</code> fields, picking the right Kafka topic, etc. [[Event Platform/Producer Requirements]] explains the producer contract.
[[Event Platform/Event Utilities|wikimedia-event-utilities]] has a Java library for producing Event Platform streams to Kafka. It can be used via pyflink through [https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python eventutilities-python].
Otherwise, you should only produce to Kafka directly if you know what you are doing.
=== MediaWiki local development ===
You'll need a local EventGate 'devserver'. The [[mw:MediaWiki-Docker/Configuration_recipes/EventGate|MediaWiki-Docker EventGate recipe]] adds an <code>eventgate</code> service to your <code>docker-compose.override.yml</code>.
Point EventBus at it by adding this to <code>LocalSettings.php</code>:
<syntaxhighlight lang="php">
wfLoadExtension( 'EventBus' );
$wgEventServices = [
'default' => [
'url' => 'http://eventgate:8192/v1/events',
],
];
$wgEventServiceDefault = 'default';
$wgEnableEventBus = 'TYPE_EVENT';
</syntaxhighlight>
The EventGate devserver fetches schemas from <code>https://schema.wikimedia.org</code> by default, and accepts events for any stream (no EventStreamConfig required locally). If you also want to validate against a schema you are currently developing, mount a local checkout of the schema repository into the container and point EventGate at it via the <code>schema_base_uris</code> setting in its config file. (The EventLogging recipe linked below shows that pattern.)
If you don't have MediaWiki Docker running yet, see [https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/master/DEVELOPERS.md DEVELOPERS.md] in mediawiki/core for setup.
== Producing analytics events ==
{{Note|For most analytics instrumentation, you should use [[Test Kitchen]].}}
From a technical perspective, analytics events are not special. However, because they have the potential to collect sensitive data, they require some special care.
<section begin="plan-experiment" />Consult the [[foundation:Legal:Data_Collection_Guidelines|Data Collection Guidelines]] before starting instrumentation to determine which risk tier your planned data collection activity falls under. If it is Low Risk, you do not need to submit a request for approval. If it is Medium or High Risk, you need to submit a request through [https://office.wikimedia.org/wiki/Legal,_Safety_%26_Security_Service_Center L3SC]. For affiliates (such as WMDE working on features and instruments that are deployed on Foundation infrastructure) who cannot access L3SC, please submit a request through the [[mediawikiwiki:Data_Platform_Engineering/Intake_Process|Data Platform Engineering intake process]] so that someone from DPE can submit a request to L3SC on your behalf. It is recommended to perform this step before starting instrumentation because during the review process you may learn that you cannot collect certain data you were planning to collect, so you will save yourself time by not writing code that you will have to remove. For more information about what your request should contain, refer to [[meta:User:MPopov_(WMF)/Sandbox/Measurement_plans_and_instrumentation_specifications|this draft guide on measurement plans and instrumentation specifications]].<section end="plan-experiment" />
Analytics schemas live in [[gitlab:repos/data-engineering/schemas-event-secondary|schemas-event-secondary]] under the <code>analytics</code> namespace. The schema authoring workflow is the same as for any other Event Platform schema. See [[Event Platform/Schemas]].
If you are not using Test Kitchen, analytics events are usually produced through the EventLogging PHP or JS APIs.
=== Registering the stream with EventLogging ===
The EventLogging JS client also needs to know it should produce the stream. Add the stream name to <code>wgEventLoggingStreamNames</code> in [[gerrit:plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/ext-EventLogging.php|ext-EventLogging.php]]:
<syntaxhighlight lang="php">
'wgEventLoggingStreamNames' => [
'default' => [
// ...
'analytics.thing_happened',
],
],
</syntaxhighlight>
=== EventLogging JavaScript ===
<syntaxhighlight lang="javascript">
mw.eventLog.submit( 'analytics.thing_happened', {
// $schema must match the $id of the schema version you're using.
$schema: '/analytics/thing_happened/1.0.0',
dt: new Date().toISOString(),
page_id: mw.config.get( 'wgArticleId' ),
action: 'clicked',
target: 'example-widget'
} );
</syntaxhighlight>
The first argument is the stream name. It must match what's declared in <code>wgEventStreams</code> in production. EventLogging sets <code>meta.stream</code> for you from this value.
=== EventLogging PHP ===
<syntaxhighlight lang="php">
$event = [
'$schema' => '/analytics/thing_happened/1.0.0',
'dt' => wfTimestamp( TS_ISO_8601 ),
'field_a' => 'value_a',
// ...
];
EventLogging::submit( 'analytics.thing_happened', $event );
</syntaxhighlight>
=== MediaWiki local development ===
The [[mw:MediaWiki-Docker/Configuration_recipes/EventLogging|MediaWiki-Docker EventLogging recipe]] is the most direct way to wire up a local EventGate together with EventLogging. It adds an <code>eventlogging</code> service to <code>docker-compose.override.yml</code> running the EventLogging devserver (which bundles EventGate), and gives you a <code>LocalSettings.php</code> snippet that points EventBus and EventLogging at it.
The recipe has two flavors. The minimal one fetches schemas from <code>https://schema.wikimedia.org</code>, which is fine if you are testing instrumentation against existing schemas. The one with local schema repositories mounts clones of <code>schemas-event-primary</code> and <code>schemas-event-secondary</code> into the container so EventGate validates against your in-progress schema work. Use that when you are developing the schema too.
Either way, every event ends up in <code>cache/events.json</code>. <code>tail -f cache/events.json</code> is the quickest way to confirm events are flowing.
Quick smoke test from the browser console:
<syntaxhighlight lang="javascript">
mw.eventLog.submit( 'test.event', {
$schema: '/test/event/1.0.0',
test: 'Hello from JavaScript!'
} );
</syntaxhighlight>
== Viewing produced events ==
Once events are flowing you can read them from [[Event Platform/EventStreams|EventStreams]], from the [[Data Platform/Data Lake|Data Lake]], or directly from Kafka.
=== EventStreams ===
There are three EventStreams instances:
* The public one, https://stream.wikimedia.org, only has streams that have been [[#Exposing events publicly|explicitly exposed]].
* The internal production one, <code>eventstreams-internal.discovery.wmnet</code>, has every stream declared in production stream config. It's not publicly reachable, so you get to it via an SSH tunnel.
* The beta one, https://stream.wikimedia.beta.wmcloud.org, has streams produced in deployment-prep.
To use the internal production instance, tunnel a local port to the discovery service:
<syntaxhighlight lang="bash">
# Tunnel local 4992 to the internal EventStreams service via a bastion.
ssh -N -L 4992:eventstreams-internal.discovery.wmnet:4992 bast1004.wikimedia.org
</syntaxhighlight>
Then open https://localhost:4992/v2/ui/ in your browser for the GUI, or use any EventSource/SSE client against https://localhost:4992/ (see https://localhost:4992/?doc). You'll have to add a security exception for the self-signed certificate.
=== Data Lake ===
Every stream is ingested into the [[Data Platform/Data Lake|Data Lake]] within a few hours. The Hive table name is a normalized version of the stream name, in the <code>event</code> database. Our example stream lands in <code>event.my_namespace_thing_happened</code>. From there you can query with [[Data Platform/Systems/Hive|Hive]], [[Data Platform/Systems/Spark|Spark]] or [[Data Platform/Systems/Presto|Presto]], and dashboard via [[Data Platform/Systems/Superset|Superset]].
Events are retained for 90 days by default. See [[Analytics/Event Sanitization|Event Sanitization]] to extend that.
=== Directly from Kafka ===
Streams are produced into datacenter-prefixed Kafka topics. The <code>my_namespace.thing_happened</code> stream produces to <code>eqiad.my_namespace.thing_happened</code> and <code>codfw.my_namespace.thing_happened</code>. To get the full stream, consume both.
To find out which topics and clusters your stream is on, ask the stream config API:
<syntaxhighlight lang="bash">
curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&streams=my_namespace.thing_happened' | jq .
</syntaxhighlight>
Most streams are mirrored into <code>jumbo-eqiad</code>, which is the easiest place to consume from on a [[Data Platform/Systems/Stat hosts|stat host]]:
<syntaxhighlight lang="bash">
# -C: consume mode
# -u: unbuffered
# -b: broker (any broker in the cluster; it'll discover the rest)
# -t: topic
# -o end: start at the end of the topic (only new events)
kafkacat -C -u -b kafka-jumbo1010.eqiad.wmnet:9092 \
-t eqiad.my_namespace.thing_happened -o end | jq .
</syntaxhighlight>
To pretty-print the last 5 messages instead:
<syntaxhighlight lang="bash">
kafkacat -C -b kafka-jumbo1010.eqiad.wmnet:9092 \
-t eqiad.my_namespace.thing_happened -o -5 -e -q | jq .
</syntaxhighlight>
For [[Test Kitchen]]-based instrumentation, events end up in <code>eqiad.product_metrics.web_base</code> (or a similar Test Kitchen topic), and you filter by <code>instrument_name</code> or <code>experiment.enrolled</code>:
<syntaxhighlight lang="bash">
kafkacat -C -b kafka-jumbo1010.eqiad.wmnet:9092 \
-t eqiad.product_metrics.web_base -o -5 -e -q | \
jq 'select(.instrument_name == "YOUR_INSTRUMENT_NAME")'
</syntaxhighlight>
{{Note|Kafka topic names are case sensitive. Capitalization is whatever you used in <code>wgEventStreams</code>. Note also that [[Event Platform/EventLogging legacy|legacy EventLogging]] streams are named differently and are not split by datacenter, e.g. <code>eventlogging_InukaPageView</code>.}}
== Evolving your schema ==
You can only make ''backwards-compatible'' schema changes, which in practice means adding new optional fields. To add a field, edit <code>current.yaml</code>, bump the version in <code>$id</code> (a minor version bump for an added field), and materialize:
<syntaxhighlight lang="shell-session">
$ npm run build-modified
$ git add jsonschema/analytics/thing_happened/*
$ git commit -m 'analytics/thing_happened - add link_text field, bump to 1.1.0'
</syntaxhighlight>
Then update the producer code to set the new field and the new <code>$schema</code> version URI. <code>jsonschema-tools</code> checks compatibility in CI, but you can run it locally too:
<syntaxhighlight lang="shell-session">
$ npm test
...
Schema Compatibility in Repository ./jsonschema/
analytics/thing_happened
Major Version 1
✓ 1.1.0 must be compatible with 1.0.0
</syntaxhighlight>
Once the new schema is merged and your producer code is deployed, events with the new field will be produced. Old events with the old version keep validating against the old schema.
=== Backwards-incompatible changes ===
In general, backwards-incompatible changes are not allowed, because there is no way to do them without coordination with all consumers. If you really need one, file a Phabricator ticket tagged with #Data-Engineering. The process will be manual and vary depending on the change.
== Per-wiki and beta stream config overrides ==
{{anchor|Per-wiki and beta overrides}}
Stream config is just MediaWiki config, so you can override it per wiki or per wiki group with the standard <code>+wikiname</code> merge syntax. This works for any MediaWiki based usage of stream config settings, e.g. EventLogging, EventBus, etc.
It does ''not'' work for EventGate or Data Lake ingestion settings, because those are not wiki-aware: they always read stream config from metawiki. Anything that affects validation or production (<code>schema_title</code>, <code>destination_event_service</code>) has to live in <code>default</code> or in a <code>+metawiki</code> override.
Example: sample 1/10 of events on enwiki only.
For [[mw:Beta Cluster|beta]], the same syntax applies in <code>InitialiseSettings-labs.php</code>, with one catch: the <code>default</code> section in <code>InitialiseSettings-labs.php</code> doesn't merge with <code>default</code> from <code>InitialiseSettings.php</code>. Only per-wiki overrides merge. So if your stream isn't yet declared in production, declare it under <code>+metawiki</code> in <code>InitialiseSettings-labs.php</code> so EventGate (which always reads from metawiki) can see it.
== Schema validation errors ==
Events that fail validation are not produced. Instead, EventGate produces a validation error event into a corresponding <code>*.error.validation</code> stream. Events sent through EventLogging end up in <code>eventgate-analytics-external.error.validation</code>, which is ingested into Hive as <code>event.eventgate_analytics_external_error_validation</code>.
Validation errors are also routed into Logstash. Useful starting points:
* [https://logstash.wikimedia.org/app/dashboards#/view/AXN5OoJu3_NNwgAUlbUT EventGate validation Kibana dashboard]
* [https://grafana.wikimedia.org/goto/x_3wAwJ7k EventGate Grafana dashboard] for per-stream validation error rates
The <code>*.error.validation</code> streams are streams like any other, so you can also subscribe to them from EventStreams.
== Exposing events publicly ==
Streams produced through the Kafka main clusters can be exposed on the public [[Event Platform/EventStreams|EventStreams]] service at https://stream.wikimedia.org. Before you expose anything, '''make sure the stream contains no PII and has been cleared against the [[foundation:Legal:Data_Collection_Guidelines|Data Collection Guidelines]]'''. Once exposed, the stream is consumable by anyone on the internet.
To expose a stream, add it to <code>allowed_streams</code> in [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/eventstreams/values.yaml|deployment-charts/helmfile.d/services/eventstreams/values.yaml]]:
<syntaxhighlight lang="yaml">
config:
# ...
allowed_streams:
# ...
- my_namespace.thing_happened
</syntaxhighlight>
Once that patch is reviewed and deployed, the stream shows up in the public [https://stream.wikimedia.org/?doc EventStreams API docs] and can be consumed at https://stream.wikimedia.org.
== Decommissioning ==
When you set up a stream, plan for how it ends. Schemas '''should not be deleted''', because there may still be older events referencing them in the Data Lake, but you can remove the stream-related code and config at any time to stop producing.
To decommission:
# Stop producing events from your code.
# Remove the stream's entry from <code>wgEventStreams</code> (and <code>wgEventLoggingStreamNames</code> if applicable).
# Mark the stream as decommissioned in its [https://datahub.wikimedia.org/search?filter_platform=urn:li:dataPlatform:eventstreams DataHub] entry.
# If you no longer need the schema, update its <code>description</code> in the schema repository (in a new materialized version) and note the deprecation in its README/CHANGELOG.
If the schema itself has to go (rare), coordinate with Data Engineering so the ingestion pipelines and alerts can be adjusted.
== See also ==
* [[Event Platform]]: concepts, architecture, background
* [[Event Platform/Schemas]] and [[Event Platform/Schemas/Guidelines]]: schema authoring and conventions
* [[Event Platform/Stream Configuration]]: stream config settings reference
* [[Event Platform/Producer Requirements]]: contract for generic producer clients
* [[Event Platform/EventGate]]: the HTTP intake service
* [[Event Platform/EventLogging legacy|EventLogging legacy]]: what changed when we moved off the old EventLogging backend
* [[Data Platform/Systems/Hadoop Event Ingestion Lifecycle|Hadoop Event Ingestion Lifecycle]]: what happens to events after Kafka
[[Category:Event Platform]]
qne1us2hj3dbigbalcxynt4z3kr1pkj
2421522
2421521
2026-05-31T10:19:16Z
Ottomata
88
2421522
wikitext
text/x-wiki
{{Navigation Event Platform}}
This is a guide for producing event streams into WMF's [[Event Platform]]. It walks through the schema and stream config, the available producer clients, local development, deployment, and how to consume the events once they're flowing.
{{Note|For most analytics instrumentation, you should use [[Test Kitchen]].}}
== Overview ==
To produce events to Event Platform, you need:
# An event JSONSchema in one of the [[Event Platform/Schemas/Guidelines#WMF Schema Repositories|WMF schema repositories]].
# A stream declared in [[Event Platform/Stream Configuration|EventStreamConfig]].
# Valid events produced via one of:
#* [[mw:Extension:EventLogging|EventLogging]] (browser JS, or PHP from MediaWiki, for analytics)
#* [[mw:Extension:EventBus|EventBus]] (PHP from MediaWiki, for production features)
#* An HTTP POST to [[Event Platform/EventGate|EventGate]] (any language, any service)
#* Directly to [[Kafka]], if you have a good reason (see [[Event Platform/Producer Requirements]])
Once those are deployed, events flow through EventGate into Kafka, get ingested into the [[Data Platform/Data Lake|Data Lake]] as a Hive table, and may also be available for consumption in the [[Event Platform/EventStreams|EventStreams HTTP API]].
== Event Platform requirements ==
=== Event schemas ===
WMF event schemas are written as YAML JSONSchemas, kept in Git, and identified by a versioned URI. They live in one of the [[Event Platform/Schemas/Guidelines#WMF Schema Repositories|WMF event schema repositories]] (e.g. [[gitlab:repos/data-engineering/schemas-event-primary|schemas-event-primary]] for operational production schemas, [[gitlab:repos/data-engineering/schemas-event-secondary|schemas-event-secondary]] for analytics).
A schema's <code>title</code> matches its path in the repository, and its <code>$id</code> is the path with a semver version on the end. For example, <code>jsonschema/my_namespace/thing_happened/1.0.0.yaml</code> might have:
<syntaxhighlight lang="yaml">
title: my_namespace/thing_happened
$id: /my_namespace/thing_happened/1.0.0
</syntaxhighlight>
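The relationship between <code>title</code>, <code>$id</code>, and the file path can be sketched in a few lines of JavaScript. This is only an illustration: <code>schemaLocation</code> is a hypothetical helper, not part of <code>jsonschema-tools</code>, and it assumes the <code>jsonschema/</code> directory layout used in the example above.

<syntaxhighlight lang="javascript">
// Hypothetical helper (not part of jsonschema-tools): derive the versioned
// $id and the materialized file path from a schema title and semver version.
function schemaLocation( title, version ) {
	if ( !/^\d+\.\d+\.\d+$/.test( version ) ) {
		throw new Error( 'Expected a semver version like 1.0.0, got: ' + version );
	}
	return {
		// $id is the title (the path in the repository) plus the version.
		$id: '/' + title + '/' + version,
		// Assumes schemas live under jsonschema/, as in the example above.
		path: 'jsonschema/' + title + '/' + version + '.yaml'
	};
}

console.log( schemaLocation( 'my_namespace/thing_happened', '1.0.0' ) );
</syntaxhighlight>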
A minimal unmaterialized <code>current.yaml</code> schema file might look like this:
<syntaxhighlight lang="yaml">
title: my_namespace/thing_happened
description: An event about when thing happened
$id: /my_namespace/thing_happened/1.0.0
$schema: https://json-schema.org/draft-07/schema#
type: object
allOf:
- $ref: /fragment/common/2.0.0#
- properties:
some_field:
description: The value of the thing that happened
type: string
</syntaxhighlight>
<code>$ref</code> pulls in required and common fields ([[#Required event fields|see below]]) so you don't have to define them yourself.
For the full schema authoring workflow (including how <code>jsonschema-tools</code> materializes versioned files), see [[Event Platform/Schemas]] and [[Event Platform/Schemas/Guidelines]]. The README in each schema repository also documents its own conventions. ''Read those before creating or modifying a schema.''
Once a schema change is merged, it should be auto deployed to https://schema.wikimedia.org within 30 minutes.
==== Required event fields ====
Every event schema must have these fields:
* <code>$schema</code>: the versioned schema URI. Must match the schema's <code>$id</code>. EventGate uses this to look up which schema to validate against.
* <code>meta.stream</code>: the name of the stream this event belongs to.
* <code>dt</code>: the event timestamp, as an ISO-8601 UTC date-time.
* <code>meta.dt</code>: the system receive timestamp, as an ISO-8601 UTC date-time.
{{Note|Generally, you should not set <code>meta.dt</code> values in your events. EventGate will set this for you.}}
See also [[Event Platform/Schemas/Guidelines#Required fields]].
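As a rough illustration of the list above, a producer could run a pre-flight check before sending an event to EventGate. This is a minimal sketch, not EventGate's validation: <code>missingRequiredFields</code> is a hypothetical helper, it only checks for presence and basic shape, and it deliberately skips <code>meta.dt</code> because EventGate sets that field.

<syntaxhighlight lang="javascript">
// Hypothetical pre-flight check. EventGate's schema validation remains
// authoritative; this only catches obviously incomplete events early.
function missingRequiredFields( event ) {
	const missing = [];
	if ( typeof event.$schema !== 'string' ) {
		missing.push( '$schema' );
	}
	if ( !event.meta || typeof event.meta.stream !== 'string' ) {
		missing.push( 'meta.stream' );
	}
	// dt must be a parseable date-time string.
	if ( typeof event.dt !== 'string' || Number.isNaN( Date.parse( event.dt ) ) ) {
		missing.push( 'dt' );
	}
	// meta.dt is not checked: EventGate fills it in on receipt.
	return missing;
}

console.log( missingRequiredFields( {
	$schema: '/my_namespace/thing_happened/1.0.0',
	meta: { stream: 'my_namespace.thing_happened' },
	dt: '2026-05-29T12:34:56Z'
} ) );
</syntaxhighlight>

Clients such as EventLogging set <code>meta.stream</code> for you, so a check like this is mainly relevant when you build the full event yourself.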
=== Stream configuration ===
Declare your stream by adding an entry to <code>wgEventStreams</code> in [[gerrit:plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/ext-EventStreamConfig.php|mediawiki-config/wmf-config/ext-EventStreamConfig.php]]. A minimal entry is:
<syntaxhighlight lang="php">
'wgEventStreams' => [
'default' => [
// ...
'my_namespace.thing_happened' => [
'schema_title' => 'my_namespace/thing_happened',
'destination_event_service' => 'eventgate-analytics-external',
],
],
],
</syntaxhighlight>
<code>schema_title</code> must match the schema's <code>title</code> field exactly. This is used to ensure that only events of that schema are allowed in the stream.
<code>destination_event_service</code> names the [[Event Platform/EventGate#EventGate clusters|EventGate cluster]] the event stream is allowed to be produced to.
Other common settings are documented at [[Event Platform/Stream Configuration#Common Settings Documentation|common stream settings]].
==== Deploying stream config ====
# Edit <code>ext-EventStreamConfig.php</code> and get the patch reviewed.
# Schedule a [[Backport windows|Backport window]] to deploy mediawiki-config (or deploy it on your own).
# Verify your stream is in the API: <code>curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&streams=my_namespace.thing_happened' | jq .</code>
# If your stream targets an EventGate cluster that only requests stream configs at startup (check the [[Event Platform/EventGate#EventGate clusters|docs]]), [[Event Platform/EventGate/Administration#Roll restart all pods|ask for that cluster to be roll-restarted]].
To override settings for [[mw:Beta Cluster|beta]] only, edit <code>InitialiseSettings-labs.php</code> instead. See [[#Per-wiki and beta overrides]] below.
== Producing events ==
There are several ways to produce events to Kafka:
* [[mw:Extension:EventBus|EventBus]]'s PHP API: for non-analytics events.
* HTTP POST to the EventGate <code>/v1/events</code> endpoint.
* [[mw:Extension:EventLogging|EventLogging]]'s PHP API: for analytics events (this uses EventBus).
* MediaWiki EventLogging JS API: POSTs to an externally exposed EventGate.
* Directly to Kafka: not recommended unless you use a supported client library.
=== Producing with EventBus ===
<syntaxhighlight lang="php">
use MediaWiki\Deferred\DeferredUpdates;
use MediaWiki\Extension\EventBus\EventBus;
DeferredUpdates::addCallableUpdate( static function () {
$event = [
'$schema' => '/my_namespace/thing_happened/1.0.0',
'meta' => [
'stream' => 'my_namespace.thing_happened',
],
'dt' => wfTimestamp( TS_ISO_8601 ),
'page_id' => 12345,
'action' => 'edited',
];
EventBus::getInstanceForStream( 'my_namespace.thing_happened' )
->send( [ $event ] );
} );
</syntaxhighlight>
<code>EventBus::getInstanceForStream</code> uses your stream's <code>destination_event_service</code> to choose the EventGate cluster.
=== Producing to EventGate ===
The POST body can be a single event or an array of events.
<syntaxhighlight lang="bash">
echo '
{
"$schema": "/my_namespace/thing_happened/1.0.0",
"meta": { "stream": "my_namespace.thing_happened" },
"dt": "2026-05-29T12:34:56Z",
"some_field": "value"
}
' | curl -H 'Content-Type: application/json' -d @- \
https://intake-analytics.wikimedia.org/v1/events
</syntaxhighlight>
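The same request can be prepared from JavaScript. The sketch below only builds the options for <code>fetch()</code> and does not send anything; <code>buildEventGateRequest</code> is a hypothetical helper, and wrapping a single event in an array is a choice made here for uniformity, not something EventGate requires.

<syntaxhighlight lang="javascript">
// Hypothetical helper: build fetch() options for POSTing one or more
// events to EventGate's /v1/events endpoint.
function buildEventGateRequest( events ) {
	// EventGate accepts a single event or an array; always send an array here.
	const batch = Array.isArray( events ) ? events : [ events ];
	return {
		method: 'POST',
		headers: { 'Content-Type': 'application/json' },
		body: JSON.stringify( batch )
	};
}

const exampleRequest = buildEventGateRequest( {
	$schema: '/my_namespace/thing_happened/1.0.0',
	meta: { stream: 'my_namespace.thing_happened' },
	dt: '2026-05-29T12:34:56Z',
	some_field: 'value'
} );
console.log( exampleRequest.body );

// To actually send it (not executed in this sketch):
// fetch( 'https://intake-analytics.wikimedia.org/v1/events', exampleRequest );
</syntaxhighlight>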
EventGate has two producer modes: 'guaranteed' and 'hasty'. 'guaranteed' is the default. See [[Event Platform/EventGate#Producer types: Guaranteed and Hasty|EventGate producer modes]] for the difference.
=== Producing to Kafka ===
You can produce directly to Kafka, but you should do everything EventGate would do: schema lookup, validation, setting <code>meta</code> fields, picking the right Kafka topic, etc. [[Event Platform/Producer Requirements]] explains the producer contract.
[[Event Platform/Event Utilities|wikimedia-event-utilities]] has a Java library for producing Event Platform streams to Kafka. It can be used via pyflink through [https://gitlab.wikimedia.org/repos/data-engineering/eventutilities-python eventutilities-python].
Otherwise, you should only produce to Kafka directly if you know what you are doing.
=== MediaWiki local development ===
You'll need a local EventGate 'devserver'. The [[mw:MediaWiki-Docker/Configuration_recipes/EventGate|MediaWiki-Docker EventGate recipe]] adds an <code>eventgate</code> service to your <code>docker-compose.override.yml</code>.
Point EventBus at it by adding this to <code>LocalSettings.php</code>:
<syntaxhighlight lang="php">
wfLoadExtension( 'EventBus' );
$wgEventServices = [
'default' => [
'url' => 'http://eventgate:8192/v1/events',
],
];
$wgEventServiceDefault = 'default';
$wgEnableEventBus = 'TYPE_EVENT';
</syntaxhighlight>
The EventGate devserver fetches schemas from <code>https://schema.wikimedia.org</code> by default, and accepts events for any stream (no EventStreamConfig required locally). If you also want to validate against a schema you are currently developing, mount a local checkout of the schema repository into the container and point EventGate at it via the <code>schema_base_uris</code> setting in its config file. (The EventLogging recipe linked below shows that pattern.)
If you don't have MediaWiki Docker running yet, see [https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/master/DEVELOPERS.md DEVELOPERS.md] in mediawiki/core for setup.
== Producing analytics events ==
{{Note|For most analytics instrumentation, you should use [[Test Kitchen]].}}
From a technical perspective, analytics events are not special. However, because they have the potential to collect sensitive data, they require some special care.
<section begin="plan-experiment" />Consult the [[foundation:Legal:Data_Collection_Guidelines|Data Collection Guidelines]] before starting instrumentation to determine which risk tier your planned data collection activity falls under. If it is Low Risk, you do not need to submit a request for approval. If it is Medium or High Risk, you need to submit a request through [https://office.wikimedia.org/wiki/Legal,_Safety_%26_Security_Service_Center L3SC]. For affiliates (such as WMDE working on features and instruments that are deployed on Foundation infrastructure) who cannot access L3SC, please submit a request through the [[mediawikiwiki:Data_Platform_Engineering/Intake_Process|Data Platform Engineering intake process]] so that someone from DPE can submit a request to L3SC on your behalf. It is recommended to perform this step before starting instrumentation because during the review process you may learn that you cannot collect certain data you were planning to collect, so you will save yourself time by not writing code that you will have to remove. For more information about what your request should contain, refer to [[meta:User:MPopov_(WMF)/Sandbox/Measurement_plans_and_instrumentation_specifications|this draft guide on measurement plans and instrumentation specifications]].<section end="plan-experiment" />
Analytics schemas live in [[gitlab:repos/data-engineering/schemas-event-secondary|schemas-event-secondary]] under the <code>analytics</code> namespace. The schema authoring workflow is the same as for any other Event Platform schema. See [[Event Platform/Schemas]].
If you are not using Test Kitchen, produce analytics events through the EventLogging PHP or JS APIs.
=== Registering the stream with EventLogging ===
The EventLogging JS client also needs to know it should produce the stream. Add the stream name to <code>wgEventLoggingStreamNames</code> in [[gerrit:plugins/gitiles/operations/mediawiki-config/+/refs/heads/master/wmf-config/ext-EventLogging.php|ext-EventLogging.php]]:
<syntaxhighlight lang="php">
'wgEventLoggingStreamNames' => [
'default' => [
// ...
'analytics.thing_happened',
],
],
</syntaxhighlight>
=== EventLogging JavaScript ===
<syntaxhighlight lang="javascript">
mw.eventLog.submit( 'analytics.thing_happened', {
// $schema must match the $id of the schema version you're using.
$schema: '/analytics/thing_happened/1.0.0',
dt: new Date().toISOString(),
page_id: mw.config.get( 'wgArticleId' ),
action: 'clicked',
target: 'example-widget'
} );
</syntaxhighlight>
The first argument is the stream name. It must match what's declared in <code>wgEventStreams</code> in production. EventLogging sets <code>meta.stream</code> for you from this value.
=== EventLogging PHP ===
<syntaxhighlight lang="php">
$event = [
'$schema' => '/analytics/thing_happened/1.0.0',
'dt' => wfTimestamp( TS_ISO_8601 ),
'field_a' => 'value_a',
// ...
];
EventLogging::submit( 'analytics.thing_happened', $event );
</syntaxhighlight>
=== MediaWiki local development ===
The [[mw:MediaWiki-Docker/Configuration_recipes/EventLogging|MediaWiki-Docker EventLogging recipe]] is the most direct way to wire up a local EventGate together with EventLogging. It adds an <code>eventlogging</code> service to <code>docker-compose.override.yml</code> running the EventLogging devserver (which bundles EventGate), and gives you a <code>LocalSettings.php</code> snippet that points EventBus and EventLogging at it.
The recipe has two flavors. The minimal one fetches schemas from <code>https://schema.wikimedia.org</code>, which is fine if you are testing instrumentation against existing schemas. The one with local schema repositories mounts clones of <code>schemas-event-primary</code> and <code>schemas-event-secondary</code> into the container so EventGate validates against your in-progress schema work. Use that when you are developing the schema too.
Either way, every event ends up in <code>cache/events.json</code>. <code>tail -f cache/events.json</code> is the quickest way to confirm events are flowing.
Quick smoke test from the browser console:
<syntaxhighlight lang="javascript">
mw.eventLog.submit( 'test.event', {
$schema: '/test/event/1.0.0',
test: 'Hello from JavaScript!'
} );
</syntaxhighlight>
== Viewing produced events ==
Once events are flowing you can read them from [[Event Platform/EventStreams|EventStreams]], from the [[Data Platform/Data Lake|Data Lake]], or directly from Kafka.
=== EventStreams ===
There are three EventStreams instances:
* https://stream.wikimedia.org (public): only has streams that have been [[#Exposing events publicly|explicitly exposed]].
* https://stream-internal.wikimedia.org/ (internal, WMF only): has almost all streams declared in production stream config.
* https://stream.wikimedia.beta.wmcloud.org (beta / deployment-prep): has streams produced in deployment-prep.
=== Data Lake ===
Almost all streams are ingested into the [[Data Platform/Data Lake|Data Lake]] within a few hours. The Hive table name is a normalized version of the stream name, in the <code>event</code> database. Our example stream lands in <code>event.my_namespace_thing_happened</code>. From there you can query with [[Data Platform/Systems/Hive|Hive]], [[Data Platform/Systems/Spark|Spark]] or [[Data Platform/Systems/Presto|Presto]], and dashboard via [[Data Platform/Systems/Superset|Superset]].
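The table name mapping can be approximated as follows. This is a sketch under an assumption: it treats "normalized" as replacing every character other than letters, digits, and underscores with an underscore, which matches the examples on this page. The ingestion pipeline's actual rules may do more, and <code>hiveTableForStream</code> is a hypothetical helper.

<syntaxhighlight lang="javascript">
// Hypothetical helper. Assumption: normalization replaces characters
// outside [A-Za-z0-9_] with underscores; the real pipeline may differ.
function hiveTableForStream( streamName ) {
	return 'event.' + streamName.replace( /[^A-Za-z0-9_]/g, '_' );
}

console.log( hiveTableForStream( 'my_namespace.thing_happened' ) );
</syntaxhighlight>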
Events are retained for 90 days by default. See [[Analytics/Event Sanitization|Event Sanitization]] to extend that.
=== Directly from Kafka ===
Streams are produced into datacenter-prefixed Kafka topics. The <code>my_namespace.thing_happened</code> stream produces to <code>eqiad.my_namespace.thing_happened</code> and <code>codfw.my_namespace.thing_happened</code>. To get the full stream, consume both.
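The topic naming above can be expressed as a small helper. This is a sketch only: <code>topicsForStream</code> is hypothetical, and it assumes the two datacenter prefixes named above. The stream config API remains the authoritative source for a stream's topics.

<syntaxhighlight lang="javascript">
// Assumption: the two datacenter prefixes mentioned in the text above.
const DATACENTER_PREFIXES = [ 'eqiad', 'codfw' ];

// Hypothetical helper: list the datacenter-prefixed topics to consume
// in order to see the full stream.
function topicsForStream( streamName ) {
	return DATACENTER_PREFIXES.map( ( dc ) => dc + '.' + streamName );
}

console.log( topicsForStream( 'my_namespace.thing_happened' ) );
</syntaxhighlight>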
To find out which topics and clusters your stream is on, ask the stream config API:
<syntaxhighlight lang="bash">
curl 'https://meta.wikimedia.org/w/api.php?action=streamconfigs&streams=my_namespace.thing_happened' | jq .
</syntaxhighlight>
Most streams are mirrored into <code>jumbo-eqiad</code>, which is the easiest place to consume from on a [[Data Platform/Systems/Stat hosts|stat host]]:
<syntaxhighlight lang="bash">
# -C: consume mode
# -u: unbuffered
# -b: broker (any broker in the cluster; it'll discover the rest)
# -t: topic
# -o end: start at the end of the topic (only new events)
kafkacat -C -u -b kafka-jumbo1010.eqiad.wmnet:9092 \
-t eqiad.my_namespace.thing_happened -o end | jq .
</syntaxhighlight>
To pretty-print the last 5 messages instead:
<syntaxhighlight lang="bash">
kafkacat -C -b kafka-jumbo1010.eqiad.wmnet:9092 \
-t eqiad.my_namespace.thing_happened -o -5 -e -q | jq .
</syntaxhighlight>
For [[Test Kitchen]]-based instrumentation, events end up in <code>eqiad.product_metrics.web_base</code> (or a similar Test Kitchen topic), and you filter by <code>instrument_name</code> or <code>experiment.enrolled</code>:
<syntaxhighlight lang="bash">
kafkacat -C -b kafka-jumbo1010.eqiad.wmnet:9092 \
-t eqiad.product_metrics.web_base -o -5 -e -q | \
jq 'select(.instrument_name == "YOUR_INSTRUMENT_NAME")'
</syntaxhighlight>
{{Note|Kafka topic names are case sensitive. Capitalization is whatever you used in <code>wgEventStreams</code>. Note also that [[Event Platform/EventLogging legacy|legacy EventLogging]] streams are named differently and are not split by datacenter, e.g. <code>eventlogging_InukaPageView</code>.}}
== Evolving your schema ==
You can only make ''backwards-compatible'' schema changes, which in practice means adding new optional fields. To add a field, edit <code>current.yaml</code>, bump the version in <code>$id</code> (a minor version bump for an added field), and materialize:
<syntaxhighlight lang="shell-session">
$ npm run build-modified
$ git add jsonschema/analytics/thing_happened/*
$ git commit -m 'analytics/thing_happened - add link_text field, bump to 1.1.0'
</syntaxhighlight>
Then update the producer code to set the new field and the new <code>$schema</code> version URI. <code>jsonschema-tools</code> checks compatibility in CI, but you can run it locally too:
<syntaxhighlight lang="shell-session">
$ npm test
...
Schema Compatibility in Repository ./jsonschema/
analytics/thing_happened
Major Version 1
✓ 1.1.0 must be compatible with 1.0.0
</syntaxhighlight>
Once the new schema is merged and your producer code is deployed, events with the new field will be produced. Old events with the old version keep validating against the old schema.
=== Backwards-incompatible changes ===
In general, backwards-incompatible changes are not allowed, because there is no way to do them without coordination with all consumers. If you really need one, file a Phabricator ticket tagged with #Data-Engineering. The process will be manual and vary depending on the change.
== Per-wiki and beta stream config overrides ==
{{anchor|Per-wiki and beta overrides}}
Stream config is just MediaWiki config, so you can override it per wiki or per wiki group with the standard <code>+wikiname</code> merge syntax. This works for any MediaWiki based usage of stream config settings, e.g. EventLogging, EventBus, etc.
It does ''not'' work for EventGate or Data Lake ingestion settings, because those are not wiki-aware: they always read stream config from metawiki. Anything that affects validation or production (<code>schema_title</code>, <code>destination_event_service</code>) has to live in <code>default</code> or in a <code>+metawiki</code> override.
Example: sample 1/10 of events on enwiki only.
For [[mw:Beta Cluster|beta]], the same syntax applies in <code>InitialiseSettings-labs.php</code>, with one catch: the <code>default</code> section in <code>InitialiseSettings-labs.php</code> doesn't merge with <code>default</code> from <code>InitialiseSettings.php</code>. Only per-wiki overrides merge. So if your stream isn't yet declared in production, declare it under <code>+metawiki</code> in <code>InitialiseSettings-labs.php</code> so EventGate (which always reads from metawiki) can see it.
== Schema validation errors ==
Events that fail validation are not produced. Instead, EventGate produces a validation error event into a corresponding <code>*.error.validation</code> stream. Events sent through EventLogging end up in <code>eventgate-analytics-external.error.validation</code>, which is ingested into Hive as <code>event.eventgate_analytics_external_error_validation</code>.
Validation errors are also routed into Logstash. Useful starting points:
* [https://logstash.wikimedia.org/app/dashboards#/view/AXN5OoJu3_NNwgAUlbUT EventGate validation Kibana dashboard]
* [https://grafana.wikimedia.org/goto/x_3wAwJ7k EventGate Grafana dashboard] for per-stream validation error rates
The <code>*.error.validation</code> streams are streams like any other, so you can also subscribe to them from EventStreams.
== Exposing events publicly ==
Streams produced through the Kafka main clusters can be exposed on the public [[Event Platform/EventStreams|EventStreams]] service at https://stream.wikimedia.org. Before you expose anything, '''make sure the stream contains no PII and has been cleared against the [[foundation:Legal:Data_Collection_Guidelines|Data Collection Guidelines]]'''. Once exposed, the stream is consumable by anyone on the internet.
To expose a stream, add it to <code>allowed_streams</code> in [[gerrit:plugins/gitiles/operations/deployment-charts/+/refs/heads/master/helmfile.d/services/eventstreams/values.yaml|deployment-charts/helmfile.d/services/eventstreams/values.yaml]]:
<syntaxhighlight lang="yaml">
config:
# ...
allowed_streams:
# ...
- my_namespace.thing_happened
</syntaxhighlight>
Once that patch is reviewed and deployed, the stream shows up in the public [https://stream.wikimedia.org/?doc EventStreams API docs] and can be consumed at https://stream.wikimedia.org.
== Decommissioning ==
When you set up a stream, plan for how it ends. Schemas '''should not be deleted''', because older events in the Data Lake may still reference them. You can, however, remove the stream-related code and config at any time to stop producing.
To decommission a stream:
# Stop producing events from your code.
# Remove the stream's entry from <code>wgEventStreams</code> (and <code>wgEventLoggingStreamNames</code> if applicable).
# Mark the stream as decommissioned in its [https://datahub.wikimedia.org/search?filter_platform=urn:li:dataPlatform:eventstreams DataHub] entry.
# If you no longer need the schema, update its <code>description</code> in the schema repository (in a new materialized version) and note the deprecation in its README/CHANGELOG.
If the schema itself has to go (rare), coordinate with Data Engineering so the ingestion pipelines and alerts can be adjusted.
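One way to double-check step 2 of the list above is to confirm that the stream no longer appears in the published stream configuration. The sketch below assumes the MediaWiki <code>streamconfigs</code> API action provided by the EventStreamConfig extension, and assumes its JSON response carries a top-level <code>streams</code> object keyed by stream name. Verify both against [[Event Platform/Stream Configuration]] before relying on it. The endpoint, helper names and User-Agent string are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumption: any wiki with EventStreamConfig enabled exposes this action.
API_URL = "https://meta.wikimedia.org/w/api.php"
# Hypothetical User-Agent; replace it with one identifying you or your tool.
USER_AGENT = "stream-decommission-check/0.1 (example sketch)"


def streamconfigs_url(stream_name: str) -> str:
    """Build the API URL asking for the config of a single stream."""
    query = urllib.parse.urlencode(
        {"action": "streamconfigs", "format": "json", "streams": stream_name}
    )
    return f"{API_URL}?{query}"


def is_still_configured(response: dict, stream_name: str) -> bool:
    """Return True if the stream is still listed in a streamconfigs response."""
    return stream_name in response.get("streams", {})


def fetch_streamconfigs(stream_name: str) -> dict:
    """Fetch and decode the streamconfigs response (performs a network call)."""
    request = urllib.request.Request(
        streamconfigs_url(stream_name), headers={"User-Agent": USER_AGENT}
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

After the config change is deployed, <code>is_still_configured(fetch_streamconfigs(name), name)</code> should return <code>False</code> for the decommissioned stream, provided the assumptions above hold.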
== See also ==
* [[Event Platform]]: concepts, architecture, background
* [[Event Platform/Schemas]] and [[Event Platform/Schemas/Guidelines]]: schema authoring and conventions
* [[Event Platform/Stream Configuration]]: stream config settings reference
* [[Event Platform/Producer Requirements]]: contract for generic producer clients
* [[Event Platform/EventGate]]: the HTTP intake service
* [[Event Platform/EventLogging legacy|EventLogging legacy]]: what changed when we moved off the old EventLogging backend
* [[Data Platform/Systems/Hadoop Event Ingestion Lifecycle|Hadoop Event Ingestion Lifecycle]]: what happens to events after Kafka
[[Category:Event Platform]]
nqte8blmav276r70y6d5veasne096ct
Analytics/Archive/EventLogging
0
460239
2421344
2026-05-30T13:52:37Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging]] to [[Obsolete:Analytics/Archive/EventLogging]]: pages are obsolete
2421344
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging]]
9rt4jrlyjd6hzfy2o54y3o6z0ph55o1
Talk:Analytics/Archive/EventLogging
1
460240
2421346
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging]] to [[Obsolete talk:Analytics/Archive/EventLogging]]: pages are obsolete
2421346
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging]]
rhtcfwcy8whf45k3wwaqe8rryhd1vim
Analytics/Archive/EventLogging/Administration
0
460241
2421348
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Administration]] to [[Obsolete:Analytics/Archive/EventLogging/Administration]]: pages are obsolete
2421348
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/Archive/EventLogging/Architecture
0
460242
2421350
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Architecture]] to [[Obsolete:Analytics/Archive/EventLogging/Architecture]]: pages are obsolete
2421350
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Architecture]]
563zn02ep8hvymxagfoiuv51v07t6wn
Analytics/Archive/EventLogging/Backfilling
0
460243
2421352
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Backfilling]] to [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]: pages are obsolete
2421352
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Backfilling]]
ixtg113mywcmt0chk7x1xzr8d83eidp
Analytics/Archive/EventLogging/Data representations
0
460244
2421354
2026-05-30T13:52:38Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data representations]] to [[Obsolete:Analytics/Archive/EventLogging/Data representations]]: pages are obsolete
2421354
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data representations]]
5e27mterkyfcdw6nxfeaiwk2jadfz6n
Analytics/Archive/EventLogging/Data retention
0
460245
2421356
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention]]: pages are obsolete
2421356
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data retention]]
let4mh654vwirmrjxg4e3q6zl0cnm93
2421411
2421356
2026-05-31T09:21:12Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Data Platform/Systems/Event Data retention]]
2421411
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention]]
ref2skyp95soto738tt04ioyi76knr2
Analytics/Archive/EventLogging/Data retention/AppInstallId
0
460246
2421358
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention/AppInstallId]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention/AppInstallId]]: pages are obsolete
2421358
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data retention/AppInstallId]]
g098zlte8xz747ht2x02wcbxbb5lwqg
2421412
2421358
2026-05-31T09:21:15Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Data Platform/Systems/Event Data retention/AppInstallId]]
2421412
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention/AppInstallId]]
icsxot2rofrwv3qotx36gy80y9wq2v6
Analytics/Archive/EventLogging/Data retention and auto-purging
0
460247
2421360
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention and auto-purging]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging]]: pages are obsolete
2421360
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging]]
logz4sjmp4q0j7ckg37rcbxp2b6uf38
2421413
2421360
2026-05-31T09:21:16Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Data Platform/Systems/Event Data retention]]
2421413
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention]]
ref2skyp95soto738tt04ioyi76knr2
Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId
0
460248
2421362
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId]] to [[Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId]]: pages are obsolete
2421362
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Data retention and auto-purging/AppInstallId]]
5u360trlcdj8if431eo7mlovadx0u32
2421414
2421362
2026-05-31T09:21:18Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Data Platform/Systems/Event Data retention/AppInstallId]]
2421414
wikitext
text/x-wiki
#REDIRECT [[Data Platform/Systems/Event Data retention/AppInstallId]]
icsxot2rofrwv3qotx36gy80y9wq2v6
Analytics/Archive/EventLogging/EventCapsule
0
460249
2421364
2026-05-30T13:52:39Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/EventCapsule]] to [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]: pages are obsolete
2421364
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/EventCapsule]]
ekk4tm809d58l8akk4jfh8oiu6c67qj
Analytics/Archive/EventLogging/How to
0
460250
2421366
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/How to]] to [[Obsolete:Analytics/Archive/EventLogging/How to]]: pages are obsolete
2421366
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/How to]]
9sr2c5npj9l0cv2hnwpacip5tg5e9j6
2421415
2421366
2026-05-31T09:21:19Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421415
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/Archive/EventLogging/Monitoring
0
460251
2421368
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Monitoring]] to [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]: pages are obsolete
2421368
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Monitoring]]
car9duvgoadqdlvtghdnzp1qalu0ntt
Analytics/Archive/EventLogging/New pipeline
0
460252
2421370
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/New pipeline]] to [[Obsolete:Analytics/Archive/EventLogging/New pipeline]]: pages are obsolete
2421370
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/New pipeline]]
2ciku8bkf5w5r7buieqns5kw1km5ydv
2421416
2421370
2026-05-31T09:21:21Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Analytics/Archive/EventLogging pipeline]]
2421416
wikitext
text/x-wiki
#REDIRECT [[Analytics/Archive/EventLogging pipeline]]
shrtt6t6y4gx8tbdte7pveq3ou9k5y1
Analytics/Archive/EventLogging/NotErrorLogging
0
460253
2421372
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/NotErrorLogging]] to [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]: pages are obsolete
2421372
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/NotErrorLogging]]
ei45szt89ywomi0djg2wti5uaglaiec
Analytics/Archive/EventLogging/Oncall
0
460254
2421374
2026-05-30T13:52:40Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Oncall]] to [[Obsolete:Analytics/Archive/EventLogging/Oncall]]: pages are obsolete
2421374
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Oncall]]
r0ue4qj0670o7rq6fb7x91rp40kqfk9
2421417
2421374
2026-05-31T09:21:22Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/Administration]]
2421417
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Administration]]
fona8eke3bmcrb3m0dvd5okz8pgn8cl
Analytics/Archive/EventLogging/Outages
0
460255
2421376
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Outages]] to [[Obsolete:Analytics/Archive/EventLogging/Outages]]: pages are obsolete
2421376
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Outages]]
p63efhynozsct7zet3mipmgjkr8ktu7
Analytics/Archive/EventLogging/Performance
0
460256
2421378
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Performance]] to [[Obsolete:Analytics/Archive/EventLogging/Performance]]: pages are obsolete
2421378
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Performance]]
qjrz2uod4uyospuds419rd1uxhkdq3p
Analytics/Archive/EventLogging/Publishing
0
460257
2421380
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Publishing]] to [[Obsolete:Analytics/Archive/EventLogging/Publishing]]: pages are obsolete
2421380
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Publishing]]
nmbj4rbsh37le6a3rekioqs0hzpkwp1
Analytics/Archive/EventLogging/Sanitization vs Aggregation
0
460258
2421382
2026-05-30T13:52:41Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Sanitization vs Aggregation]] to [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]: pages are obsolete
2421382
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
ozb5iyz2xjb7w6y5xu3x21aukwi414q
Analytics/Archive/EventLogging/Schema Guidelines
0
460259
2421384
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Schema Guidelines]] to [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]: pages are obsolete
2421384
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Schema Guidelines]]
ds2ev3dw8t6jov0i0t75yawt9o5qhae
Analytics/Archive/EventLogging/Sensitive Fields
0
460260
2421386
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/Sensitive Fields]] to [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]: pages are obsolete
2421386
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/Sensitive Fields]]
k9nkblptw4pz7l3cvunzjff64jjqh0w
Analytics/Archive/EventLogging/TestingOnBetaCluster
0
460261
2421388
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/TestingOnBetaCluster]] to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]: pages are obsolete
2421388
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Analytics/Archive/EventLogging/TestingOnBetaLabs
0
460262
2421390
2026-05-30T13:52:42Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/TestingOnBetaLabs]] to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaLabs]]: pages are obsolete
2421390
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaLabs]]
1vt43osvz5qqjfa2ljjcutp2aj33hyo
2421418
2421390
2026-05-31T09:21:23Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
2421418
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/TestingOnBetaCluster]]
qub4iquiaon5ht2h3zzd43rkywtbf6a
Analytics/Archive/EventLogging/User agent sanitization
0
460263
2421392
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Analytics/Archive/EventLogging/User agent sanitization]] to [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]: pages are obsolete
2421392
wikitext
text/x-wiki
#REDIRECT [[Obsolete:Analytics/Archive/EventLogging/User agent sanitization]]
lq89feu0hym2433yawxpz27l176t1h0
Talk:Analytics/Archive/EventLogging/Administration
1
460264
2421394
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Administration]] to [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]: pages are obsolete
2421394
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Administration]]
d8aoaqa0dou1rrttrh352hsovjjill0
Talk:Analytics/Archive/EventLogging/Data retention
1
460265
2421396
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Data retention]] to [[Obsolete talk:Analytics/Archive/EventLogging/Data retention]]: pages are obsolete
2421396
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Data retention]]
igxssdhex4mrgromq3veztkffwrmaqh
2421501
2421396
2026-05-31T09:23:04Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Talk:Data Platform/Systems/Event Data retention]]
2421501
wikitext
text/x-wiki
#REDIRECT [[Talk:Data Platform/Systems/Event Data retention]]
ftxupil1ga4lxpt6tkgkh9sbfmmj5p6
Talk:Analytics/Archive/EventLogging/Data retention and auto-purging
1
460266
2421398
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Data retention and auto-purging]] to [[Obsolete talk:Analytics/Archive/EventLogging/Data retention and auto-purging]]: pages are obsolete
2421398
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Data retention and auto-purging]]
aau574hsmoek9bnro520nmze5b1wxuj
2421502
2421398
2026-05-31T09:23:05Z
Dexbot
30554
Bot: Bot: Fixing double redirect to [[Talk:Data Platform/Systems/Event Data retention]]
2421502
wikitext
text/x-wiki
#REDIRECT [[Talk:Data Platform/Systems/Event Data retention]]
ftxupil1ga4lxpt6tkgkh9sbfmmj5p6
Talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation
1
460267
2421400
2026-05-30T13:52:43Z
Ottomata
88
Ottomata moved page [[Talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]] to [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]: pages are obsolete
2421400
wikitext
text/x-wiki
#REDIRECT [[Obsolete talk:Analytics/Archive/EventLogging/Sanitization vs Aggregation]]
lzpyfc1dpi1gsu0axf5047tshmgp3fh